This paper studies the identification of the Lévy jump measure of
a discretely-sampled semimartingale. We define successive Blumenthal- 
Getoor indices of jump activity, and show that the leading index can 
always be identified, but that higher order indices are only identi- 
fiable if they are sufficiently close to the previous one, even if the 
path is fully observed. This result establishes a clear boundary on 
which aspects of the jump measure can be identified on the basis of 
discrete observations, and which cannot. We then propose an estima- 
tion procedure for the identifiable indices and compare the rates of 
convergence of these estimators with the optimal rates in a special 
parametric case, which we can compute explicitly. 

1. Introduction. Let X be a one-dimensional semimartingale defined on 
a finite time interval [0,T]. Our objective is to make some progress toward 
the identification of the jump measure of X at high frequency. The moti- 
vation for what follows has its roots in a family of econometric problems, 
which can be stated as follows. We observe a single path of X, but not 
fully: although other observation schemes are possible, the most typical is 
one where we observe the variables $X_{i\Delta_n}$ for $i = 0, 1, \ldots, [T/\Delta_n]$, where $[x]$
denotes the integer part of the real $x$, over a fixed observation span $T$ and
where $\Delta_n$ is small. Asymptotic results are derived in the high-frequency
limit where the sequence $\Delta_n$ goes to 0. The overall objective is to find out
what can be recovered, that is, identified, about the dynamics of X, in this 
setup where a single path, partially observed at a discrete time interval, is 
all that is available. For those parameters which can be identified, we also 
want asymptotically consistent estimators, with a rate whenever possible. 

For the dynamics of $X$, we restrict our attention to Itô semimartingales,
meaning that the characteristics $(B, C, \nu)$ of $X$ can be written as follows:

(1) $B_t(\omega) = \int_0^t b_s(\omega)\,ds, \qquad C_t(\omega) = \int_0^t c_s(\omega)\,ds, \qquad \nu(\omega, dt, dx) = dt\,F_t(\omega, dx)$

for some adapted processes $b_t$ and $c_t$ and measure $F_t(\omega, dx)$. Recall that $B$
is the drift, $C$ is the quadratic variation of the continuous martingale part
and $\nu$ is the compensator of the jump measure $\mu$ of $X$ [see Jacod and
Shiryaev (2003) for more details on characteristics]. As is well known, these
are the canonical models for arbitrage-free asset prices.

A sizeable part of the paper, however, is concerned with the much-restricted
class of Lévy processes. A semimartingale $X$ is a Lévy process if and only
if (1) holds with $b_t(\omega) = b \in \mathbb{R}$ and $c_t(\omega) = c \ge 0$ and $F_t(\omega, dx) = F(dx)$
independent of $\omega$ and $t$. The measure $F$ is the Lévy measure, and it integrates
$x^2 \wedge 1$. The (deterministic) triple $(b, c, F)$ is then the characteristic triple
appearing in the Lévy-Khintchine formula, which provides the characteristic
function of $X_t$,

(2) $E(e^{iuX_t}) = \exp t\Bigl( iub - \dfrac{cu^2}{2} + \int F(dx)\bigl(e^{iux} - 1 - iux\,1_{\{|x| \le 1\}}\bigr) \Bigr).$

This completely characterizes the entire law of $X$.

Ultimately, we would like to identify as much as we can of the char- 
acteristics B, C and v, and give consistent estimators for the identifiable 
parameters. The situation is well understood for the first two characteris- 
tics, B and C. When X is fully observed on [0,T], one knows the jumps 
(size and location) occurring within the interval, and the quadratic variation
of $X$ on $[0, T]$, hence the function $t \mapsto C_t$ on $[0, T]$. On the other hand,
and at least when C is strictly increasing (which is the case in almost all 
models used in practice), nothing can be said about the drift B. When the 
process is observed only at discrete times, Ct is no longer exactly known, 
but there are well established methods to estimate it in a consistent way as 
the observation mesh goes to 0, even in the presence of jumps. 

We focus on the remaining open question, which concerns identifiability 
and estimation for the third characteristic, $\nu$, or equivalently $F_t$, for a discretely
sampled semimartingale. The measure $F_t$ in a sense describes the law
of a jump occurring at time $t$, conditionally on the past before $t$. There is
a vast literature on identifying the Levy measure when the time horizon T is 
asymptotically infinite, and when X is a Levy process; see, for example, Ba- 
sawa and Brockwell (1982), Figueroa-Lopez and Houdre (2006), Nishiyama 
(2008), Neumann and Reiss (2009) and Comte and Genon-Catalot (2009).
But over a finite time horizon $T$, we cannot reconstruct $\nu$ fully because there
are only finitely many jumps on $[0, T]$ with size bigger than any $\varepsilon > 0$. The
open question which we seek to address in this paper is: what can we and
can we not identify about $\nu$? High-frequency data analysis has proved a very
fruitful area of research. As we will see, however, it is not able to achieve 
everything, and our objective in this paper is to pinpoint exactly the limi- 
tations, or frontier, involved in using high-frequency data over a fixed time 
span. 

We can say something about the concentration of $\nu$ around 0. For example,
we can decide for which $p > 0$ we have $\int_0^T ds \int F_t(\omega, dx)(|x|^p \wedge 1) < \infty$,
because outside a null set again these are exactly those $p$'s for which
$\sum_{s \le T} |\Delta X_s(\omega)|^p < \infty$, where $\Delta X_s = X_s - X_{s-}$ is the size of the jump at
time $s$, if any. The infimum of all such $p$'s is a generalization of the Blumenthal-Getoor
index (or BG index) of the process up to time $T$, and it is known
when $X$ is fully observed. Note that a priori it is random, and also increasing
with $T$, and always with values in $[0, 2]$. However, in the Lévy process
case, it reduces to $\inf\{p : \int F(dx)(|x|^p \wedge 1) < \infty\}$, and is nonrandom and
independent of time. It was originally introduced by Blumenthal and Getoor
(1961), and for a stable process the BG index is also the stability index of
the process.

The interest in identifying the BG index lies in the fact that the index 
allows for a classification of the processes from least active to most active: 
processes with BG index equal to 0 are either finitely active or infinitely
active but with slow, sub-polynomial, divergence of v near 0; processes with 
BG index strictly positive are all infinitely active; processes with BG index 
less than 1 have paths of finite variation; processes with BG index greater 
than 1 have paths of infinite variation; and in the limit, processes with 
continuous paths have an "activity index" (the analog of the BG index 
which no longer exists) equal to 2 when the volatility is not vanishing. In 
other words, jumps become more and more active as the BG index increases 
from 0 to 2, and we can think of this generalized BG index as an index of
jump activity. 

In the case of discrete observations at times $i\Delta_n$ with $\Delta_n$ going to 0,
recovering the random BG index in full generality seems out of reach, but
Aït-Sahalia and Jacod (2009a) constructed estimators of the nonrandom
number $\beta$ that are consistent as $\Delta_n \to 0$, under the main assumption that
locally near 0, we have the behavior

(3) $F_t(\omega, [-u, u]^c) \sim \dfrac{a_t}{u^{\beta}}$ as $u \downarrow 0$

(plus a few technical hypotheses), where $a_t \ge 0$ is a process: in this case,
$\beta$ is the (deterministic) BG index at time $t$, on the set $\{\int_0^t a_s\,ds > 0\}$. We






Y. AIT-SAHALIA AND J. JACOD 



call this behavior "proto-stable," since it is similar to that of a stable pro- 
cess but only near 0. Away from a neighborhood of 0, the jump measure is 
completely unrestricted. We obtained the rate of convergence and a central 
limit theorem for the estimators, depending upon the rate in the approximations
(3). Related estimators or tests for $\beta$ include Belomestny (2010),
Cont and Mancini (2011) and Todorov and Tauchen (2010). 

We can think of (3) as providing the leading term, near 0, of the jump 
measure of X. Given that this term is identifiable, but that the full measure v 
is not, our aim is to examine where the boundary between what can versus 
what cannot be identified lies. Toward this aim, one direction to go is to 
view (3) as giving the first term of the expansion of the "tail" $F_t(\omega, [-u, u]^c)$
near 0, and go further by assuming a series expansion such as

(4) $F_t(\omega, [-u, u]^c) \sim \sum_{i \ge 1} \dfrac{a_t^i}{u^{\beta_i}}$ as $u \downarrow 0$

(the precise assumption is given in Section 2), with successive powers
$\beta_1 = \beta > \beta_2 > \beta_3 > \cdots$. Those $\beta_i$'s will be the "successive BG indices." This series
expansion can, for example, result from the superposition of processes with 
different BG indices, in a model consisting of a sum of such processes. 

The question then becomes one of identifying the successive terms in that
expansion. The main theoretical result of the paper, which is somewhat
surprising, is as follows: the first index $\beta_1$ is always identifiable, as we already
knew, but the subsequent indices $\beta_i$ which are bigger than $\beta_1/2$ are identifiable,
whereas those smaller are not. An intuition for this particular value of
the "identifiability boundary" is as follows: in view of (4) the estimation of
the $\beta_i$'s can only be based on preliminary estimations of $F_t(\omega, [-u, u]^c)$, or
of an integrated (in time) version of this, for a sequence $u_n \to 0$. It turns out
that, even in idealized circumstances, an estimation of $F_t(\omega, [-u_n, u_n]^c)$ or
of its integrated version has a rate of convergence $u_n^{\beta_1/2}$ (there is a central
limit theorem for this), so that any term contributing to $F_t(\omega, [-u_n, u_n]^c)$
by an amount less than $u_n^{-\beta_1/2}$ is fundamentally unreachable: we can only
hope to estimate a further coefficient $\beta_i$ if it leads to a number of increments
greater than $u_n$ (which is of order $u_n^{-\beta_i}$) that is larger than the sampling
error in the number of terms generated by the first coefficient, implying that
any $\beta_i < \beta_1/2$ cannot be identified. This shows that there are limits to our
ability to identify these successive terms, even in the unrealistic situation
where the process is fully observed, and the behavior of $\nu$ around 0 is only
partly identifiable.
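
The counting heuristic in the preceding paragraph can be checked numerically. The sketch below assumes a fully observed Lévy process whose tail has exactly two power terms, so that the number of jumps larger than $u$ is Poisson distributed; the function name, the constants and the chosen indices are illustrative assumptions and do not come from the paper.

```python
import math

def second_term_signal_to_noise(u, beta1, beta2, a1=1.0, a2=1.0, horizon=1.0):
    """Expected number of jumps > u contributed by the second index, divided
    by the Poisson standard deviation of the total count of jumps > u.
    All names and default values are illustrative assumptions."""
    mean_first = horizon * a1 * u ** (-beta1)
    mean_second = horizon * a2 * u ** (-beta2)
    noise = math.sqrt(mean_first + mean_second)  # Poisson: variance = mean
    return mean_second / noise

thresholds = (1e-2, 1e-4, 1e-6)
# beta2 above beta1 / 2: the ratio grows as u shrinks, the term is visible.
above = [second_term_signal_to_noise(u, 1.5, 1.0) for u in thresholds]
# beta2 below beta1 / 2: the ratio shrinks to 0, the term is lost in the noise.
below = [second_term_signal_to_noise(u, 1.5, 0.5) for u in thresholds]
```

The ratio behaves like $u^{\beta_1/2 - \beta_2}$ for small $u$, which is why the boundary sits at $\beta_2 = \beta_1/2$ in this idealized setting.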

When the identifiability conditions are satisfied, and when the process is
observed at discrete times with mesh $\Delta_n$, we will construct estimators of
the parameters which are consistent as $\Delta_n \to 0$, and determine their rates of
convergence, which we will see are slow. In the case where we have only two indices
$\beta_1 > \beta_2$ with $\beta_2 > \beta_1/2$, we will further compare the rates of the estimators
we exhibit, which are semiparametric, to the optimal rate achievable in
a corresponding parametric sub-model (the sum of two stable processes,
plus a drift and a Brownian motion).

Fig. 1. Two BG component model: regions where the components are identified versus
not identified, and optimal rate of convergence.

The main results of the paper are summarized in Figure 1 for the two-component
situation. We already noted that $\beta_2$ can be identified only if it is
bigger than $\beta_1/2$; we will also see that the rate at which $\beta_2$ can be estimated
increases as $\beta_2$ gets closer to $\beta_1$, and conversely decreases as $\beta_2$ gets closer
to $\beta_1/2$, in the limit dropping to 0 as $\beta_2$ approaches $\beta_1/2$, consistently
with the loss of identification that occurs at that point. Beyond the two-component
model, we will provide general identifiability conditions and rates
of convergence for the leading and higher order BG indices.

The paper is organized as follows. We first define the successive BG in- 
dices in Section 2. In Section 3, we study the identifiability of the parameters 
appearing in the expansion, from a theoretical viewpoint and in the special 
case of Levy processes. Then we introduce consistent estimators for those 
parameters which we have found to be identifiable in the Levy case, hence 
proving de facto their identifiability. This is done according to a two-step 
procedure, with preliminary estimators given in Section 4, and final estima- 
tors with much faster rates in Section 5. Unfortunately, although rates are 
given, we were not able to show a central limit theorem for these estima- 
tors, although such theorems ought to be available and would be crucial for 
obtaining confidence bounds. 

In principle, those estimators could be used on real data, but the rates 
of convergence for the higher order indices are, by necessity, quite slow. We 
show in Section 6 that the slow nature of these rates of convergence is an 
inherent feature of the problem that cannot be improved upon. This is per- 
haps not too surprising since the range of values of the higher order indices 
that are identified is limited, and hence one would expect the rate of con- 
vergence to deteriorate all the way to zero as one approaches the region 
where identification disappears. We provide in Section 7 a simulation study
for a model featuring a stochastic volatility plus two stable processes with
different indices, the aim being to identify these two indices, especially the
higher order one. A realistic application to high-frequency financial data is
out of the question for the typical sample sizes that are currently available, 
but may be useful in the future or in different fields of applications where 
semimartingales are used and where data are available in vast quantities, 
such as the study of Internet traffic or turbulence data in meteorology. The 
results do also present theoretical interest, especially as they set up bounds 
on what is asymptotically identifiable in the jump measure of a semimartin- 
gale, and consequently what is not. 

2. The successive Blumenthal-Getoor indices. Throughout the paper,
$X$ is an Itô semimartingale with characteristics given by (1), on a filtered
probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$. The time horizon for the observations is
T > 0, so the behavior of X after time T does not matter for us below. 

Our first aim is to give a precise meaning to an hypothesis like (4). In- 
stead of requiring an expansion like this for all times t, we rather use the 
"integrated version" which uses the following family of (adapted, continuous 
and increasing) processes: 

(5) $u > 0 \;\Rightarrow\; A(u)_t = \int_0^t F_s([-u, u]^c)\,ds.$

The basic assumption is as follows: 

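Assumption 1. There are a nonrandom integer $j$, a strictly decreasing
sequence $(\beta_i)_{1 \le i \le j+1}$ of numbers in $[0, 2)$ and a sequence $(A^i)_{1 \le i \le j+1}$ of
processes such that

(6) $t \in [0, T],\ 0 < u \le 1 \;\Rightarrow\; \Bigl| A(u)_t - \sum_{i=1}^{j} \dfrac{A^i_t}{u^{\beta_i}} \Bigr| \le \dfrac{A^{j+1}_t}{u^{\beta_{j+1}}}.$

Moreover, we have $A^i_T > 0$ for $i = 1, \ldots, j$.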

If this assumption is satisfied with some $j \ge 2$, it is also satisfied with any
smaller integer. The processes $A^i$ and $A^{j+1}$ are nondecreasing nonnegative, and
they can always be chosen to be predictable.

Clearly, $\beta = \beta_1$ is the BG index, as introduced before, and the following
definition comes naturally:

Definition 1. Under Assumption 1, the numbers $\beta_1, \beta_2, \ldots, \beta_j$ are called
the successive BG indices of the process $X$ over the time interval $[0, T]$, and
the variables $A^i_T$ are called the associated integrated intensities.

Example 1. Let $Y^1, \ldots, Y^j$ be independent stable processes with indices
$\beta_1 > \cdots > \beta_j$. Then $X = Y^1 + \cdots + Y^j$ satisfies (6) with $A^{j+1} = 0$,
and the successive indices and integrated intensities are $\beta_i$ and $T a_i$, where
$a_i = \lim_{u \to 0} u^{\beta_i} F^i([-u, u]^c)$, and $F^i$ is the Lévy measure of $Y^i$.

If the $Y^i$'s are tempered stable processes [see Rosiński (2007)] the same
is true, provided $\beta_j > \beta_1 - 1$.
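
As a small numerical illustration of Example 1, the sketch below evaluates the symmetric tail of a sum of independent stable processes directly from the tail constants $a_i$, and checks that $u^{\beta_1}$ times the tail approaches $a_1$ as $u \to 0$. The function names and the numerical values are hypothetical choices made for the illustration.

```python
def stable_sum_tail(u, betas, a):
    """Symmetric tail F([-u, u]^c) of a sum of independent stable processes,
    written with the tail constants a_i of Example 1."""
    return sum(a_i * u ** (-b_i) for a_i, b_i in zip(a, betas))

def integrated_intensities(horizon, a):
    """Integrated intensities T * a_i over [0, T] in the Levy case."""
    return [horizon * a_i for a_i in a]

betas, a = [1.5, 1.0, 0.8], [2.0, 1.0, 0.5]  # illustrative values
u = 1e-8
# u**beta_1 * tail(u) = a_1 + lower-order terms that vanish as u -> 0.
leading_constant = u ** betas[0] * stable_sum_tail(u, betas, a)
```

The lower-order terms vanish at the rates $u^{\beta_1 - \beta_i}$, so the closer the indices are to each other, the smaller $u$ must be before the leading constant dominates.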

Example 2. A semimartingale consisting of a continuous component
and a jump part driven by a sum of such processes also satisfies (6). Let
$X_t = X_0 + Z_t + \sum_{i=1}^{j} \int_0^t H^i_s\,dY^i_s$, with $Z$ a continuous Itô semimartingale and $Y^i$
as in the previous example and $H^i$ locally bounded predictable processes
with $\int_0^T |H^i_s|^{\beta_i}\,ds > 0$. The successive BG indices are again the $\beta_i$'s, with the
associated integrated intensities

$A^i_T = a_i \int_0^T |H^i_s|^{\beta_i}\,ds.$

Remark 1. We have taken a finite family of possible indices $\beta_i$. Nothing
prevents us from taking an infinite sequence: we simply have to assume that
Assumption 1 holds for all $j$, with additionally $\lim_{i \to \infty} \beta_i = 0$. However, in
view of the restriction imposed on the BG indices by our main theorems 
below about identifiability, this more general situation has no statistical 
interest. 

Remark 2. Assumption 1 imposes a certain structure on the behavior 
of the jump measure of the process near 0. It is important to note that it 
does not restrict in any way the behavior of the jump measure away from 0. 
Although most models used in practice and with infinite activity jumps 
satisfy this assumption, the Gamma process does not: although it (barely) 
exhibits infinite activity, its BG index is 0, and $A(u)_t$ is of order $\log(1/u)$.
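
The logarithmic growth for the Gamma process can be verified directly. The sketch below assumes the standard Gamma Lévy density $c\,e^{-\lambda x}/x$ on $(0, \infty)$, whose tail is $c\,E_1(\lambda u)$ with $E_1$ the exponential integral; the parameter names `scale` and `rate` and the series-based helper are illustrative assumptions, and the power series is only intended for small to moderate arguments.

```python
import math

EULER_GAMMA = 0.5772156649015329

def exp_integral_e1(z, terms=60):
    """E1(z) for z > 0 via its power series (accurate for small/moderate z)."""
    series = 0.0
    power_over_factorial = 1.0  # holds z**k / k! inside the loop
    for k in range(1, terms + 1):
        power_over_factorial *= z / k
        series += (-1) ** (k + 1) * power_over_factorial / k
    return -EULER_GAMMA - math.log(z) + series

def gamma_tail(u, scale=1.0, rate=1.0):
    """Tail F((u, inf)) of the Gamma Levy measure scale * exp(-rate*x)/x dx."""
    return scale * exp_integral_e1(rate * u)

# tail(u) / log(1/u) tends to `scale`: divergence is only logarithmic, so no
# power u**(-beta) with beta > 0 matches it and the BG index is 0.
ratios = [gamma_tail(u) / math.log(1.0 / u) for u in (1e-3, 1e-6, 1e-12)]
```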

In Assumption 1, expansion (6) is central, but one may wonder about the 
additional requirement $A^i_T > 0$. So, we end this section with some comments
and extensions, which may look complicated and are not necessary for the 
rest of the paper, but which we think are useful and somewhat enlightening. 

Extension 1. In Assumption 1 positive and negative jumps are treated 
in the same way. In practice, it might be useful for modeling purposes to 
establish the behavior of positive and negative jumps separately. Toward 
this end, one can replace (5) by 

$A(u)^{(+)}_t = \int_0^t F_s((u, \infty))\,ds, \qquad A(u)^{(-)}_t = \int_0^t F_s((-\infty, -u))\,ds.$

Then, if one is interested in positive jumps only, say, one replaces (6) by
a similar expansion for $A(u)^{(+)}_t$: all the content of the paper still holds,
mutatis mutandis, under this modified assumption, for positive jumps. The 
same is true of negative jumps, and the "positive" and "negative" successive 
BG indices can of course be different. 

Extension 2. Now we come to the requirement $A^i_T > 0$, which in Assumption 1
is supposed to hold for all (or, almost all) $\omega$. This is of course
unlikely to hold for the terminal time $T$, unless it holds for all $t > 0$, and even
unless the processes $A^i$ are strictly increasing. In Example 2, this amounts
to supposing that none of the processes $H^i$ vanishes. However, it might be relevant
in practice to allow for each $H^i$ to vanish on some (possibly random) time
intervals: we then can have different components of the model turned on and
off at different times.

Thus, let us examine what happens if we relax the requirements $A^i_T > 0$.
For any particular outcome $\omega$, the (first) BG index of the process $X$ is $\beta_i$,
where $i$ is the smallest integer such that $A^i_T > 0$, and if all of them vanish
one only knows that the BG index is not bigger than $\beta_{j+1}$. The same applies
to further indices. In other words, one can define a partition of $\Omega$ indexed
by all subsets $D$ of $\{1, \ldots, j\}$ as follows:

(7) $\Omega_T(D) = \{\omega : A^i_T(\omega) > 0 \text{ for all } i \in D \text{ and } A^i_T(\omega) = 0 \text{ for all } i \in \{1, \ldots, j\} \setminus D\}.$




Then, for any $\omega$, the successive BG indices of $X$ over $[0, T]$ and the associated
intensities are the numbers $\beta'_1(\omega), \ldots, \beta'_{J(\omega)}(\omega)$ and $\Gamma_i(\omega)$, defined as

(8) $J(\omega) = m, \quad \beta'_i(\omega) = \beta_{l_i}, \quad \Gamma_i(\omega) = A^{l_i}_T(\omega) \quad \text{if } \omega \in \Omega_T(\{l_1, \ldots, l_m\})$ with $l_1 < \cdots < l_m$.

On the set $\Omega_T(\emptyset)$, which is not necessarily empty, we have $J = 0$ and no $\beta'_i$'s.

All results of this paper are true if we relax $A^i_T > 0$ in Assumption 1,
provided we replace $j$ by $J$ and the $\beta_i$'s by the $\beta'_i$, in restriction to the
set $\Omega_T(D)$: this is indeed very easy, because on this set the process $X$ coincides
at all times $t \in [0, T]$ with a process $X'$ which satisfies Assumption 1 as
stated above, with $(j, \beta_1, \ldots, \beta_j, \beta_{j+1})$ substituted with $(m, \beta_{l_1}, \ldots, \beta_{l_m}, \beta_{j+1})$,
when $D = \{l_1, \ldots, l_m\}$.

3. Identifiability in the Levy case. Loosely speaking, in an asymptotic 
statistical framework, identifiability of a parameter means the existence of 
a sequence of estimators which is (weakly) consistent. Identifiability can be 
"proved" by exhibiting such a sequence. It can be "disproved" by theoretical 
arguments, such as the fact that if the parameter is identifiable in our high-frequency
observations setting, then, were the path $t \mapsto X_t$ fully observed
on $[0, T]$, it would enjoy "nonasymptotic" identifiability in the sense that its
value is almost surely known. For example, in the simple model $X_t = bt + W_t$
the parameter b does not enjoy this nonasymptotic property because the laws 
of the process X (restricted to [0,T]) are all equivalent when b varies, and 
thus b is even less identifiable in the asymptotic setting. 

Disproving identifiability is usually a hard task, especially in a nonparametric
setting. However, if a parameter is not identifiable for a certain class
of models, it is of course not identifiable for any wider class. 

These arguments lead us to consider the very special situation of a Lévy
process $X$, with Lévy-Khintchine characteristics $(b, c, F)$ [see (2)] when the
path $t \mapsto X_t$ is fully observed on $[0, T]$. In this section we are interested in
nonasymptotic identifiability of those characteristics, or functions of them.
Note that, were $T$ infinite, the triple $(b, c, F)$ would be identifiable because,
for example, one would know the values of all the i.i.d. increments $X_{n+1} - X_n$,
giving us almost surely the law of $X_1$, which in turn determines the
triple $(b, c, F)$.

This is no longer the case when, as in this paper, the time interval $[0, T]$
is finite. In this case, we give a formal definition of identifiability. We use
$Q_{b,c,F}$ to denote the law of the process $X$, restricted to the interval $[0, T]$
($T$ is kept fixed throughout). So $Q_{b,c,F}$ is a probability measure on the
Skorokhod space $\mathbb{D} = \mathbb{D}([0, T], \mathbb{R})$. We also let $\mathcal{T}$ be some given subset of all
possible triples $(b, c, F)$.

Definition 2. A function $H$ is identifiable on the class $\mathcal{T}$ if, for any
two $(b, c, F)$ and $(b', c', F')$ in $\mathcal{T}$ such that $H(b', c', F') \ne H(b, c, F)$, we have
$Q_{b,c,F} \perp Q_{b',c',F'}$ (i.e., the two measures $Q_{b,c,F}$ and $Q_{b',c',F'}$ are mutually
singular).

The rationale behind this definition is as follows: if $H$ is identifiable and
$(b, c, F) \in \mathcal{T}$, and $X$ is drawn according to the law $Q_{b,c,F}$, then we can discard
with probability 1 any fixed $(b', c', F') \in \mathcal{T}$ such that
$H(b', c', F') \ne H(b, c, F)$. Unfortunately, this does not mean that we can (almost surely) reject
all $(b', c', F')$ with $H(b', c', F') \ne H(b, c, F)$ simultaneously: this stronger
property is (almost) never satisfied.

There exists a criterion for mutual singularity of $Q_{b,c,F}$ and $Q_{b',c',F'}$; see
Remark IV.4.40 of Jacod and Shiryaev (2003). We have a Lebesgue decomposition
$F' = f \cdot F + F'^{\perp}$ of $F'$ with respect to $F$, with $f$ a nonnegative
Borel function and $F'^{\perp}$ a measure supported by an $F$-null set. Then
$Q_{b',c',F'} \perp Q_{b,c,F}$ if and only if at least one of the following five properties is
violated:

(9) $\begin{cases} F'^{\perp}(\mathbb{R}) < \infty, \\ \alpha(F, F') = \int \bigl(|f(x) - 1|^2 \wedge |f(x) - 1|\bigr) F(dx) < \infty, \\ \alpha'(F, F') = \int_{\{|x| \le 1\}} |x|\,|f(x) - 1|\,F(dx) < \infty, \\ c = 0 \;\Rightarrow\; b' = b - \int_{\{|x| \le 1\}} x (f(x) - 1) F(dx), \\ c' = c. \end{cases}$

It clearly follows that the function $H(b, c, F) = c$ is identifiable on any
class $\mathcal{T}$ (a well-known fact). The function $H(b, c, F) = b$ is not identifiable in
general; however, on the class of all $(b, c, F)$ having $c = 0$ and $\int_{\{|x| \le 1\}} |x|\,F(dx) < \infty$
the function $H(b, c, F) = \bar b = b - \int_{\{|x| \le 1\}} x\,F(dx)$ (which is the "real" drift,
in the sense that $X_t = \bar b t + \sum_{s \le t} \Delta X_s$) is identifiable.

In the sequel we are not interested in b or c, but in F only. That is, we are 
looking at functions H = H(F). This leads us to consider classes of the form 

(10) $\mathcal{T} = \mathbb{R} \times \mathbb{R}_+ \times \mathcal{T}_3$ where $\mathcal{T}_3$ is a set of Lévy measures.

In words, we want no restriction on the parameters $b$ and $c$. Of course $\mathcal{T}_3$
should not be a singleton, and $H(F)$ should not be constant on $\mathcal{T}_3$, otherwise
the identifiability problem is empty.
The following example is clear:

Example 3. If $\mathcal{T}_3$ is a set of measures which coincide with some given $F$
on a neighborhood of 0, then by (9) no nontrivial $H(F)$ is identifiable on $\mathcal{T}$.

This implies that, in the best-case scenario, a function H(F) can be iden- 
tifiable only if it depends on the "behavior of the measure F around 0." Giv- 
ing a necessary and sufficient condition for identifiability of such a function, 
other than saying that one of the properties in (9) fails when $H(F) \ne H(F')$,
seems out of reach. However, this is possible for some specific, but relatively
large, classes of sets $\mathcal{T}_3$, with a priori relatively surprising results. Below
we introduce such a class, in order to illustrate the nature of the available 
results. 

Definition 3 (The class $\mathcal{T}_3^{(1)}$ of Lévy measures). We say that a Lévy
measure $F$ belongs to this class if we have

(11) $F(dx) = \widetilde F(dx) + \sum_{i=1}^{\infty} \dfrac{a_i \beta_i}{|x|^{1+\beta_i}}\, 1_{[-\eta, \eta]}(x)\,dx$, where $\eta > 0$ and

(i) $0 \le \beta_{i+1} \le \beta_i < 2$, $\quad \beta_i > 0 \Rightarrow \beta_i > \beta_{i+1}$, $\quad \lim_{i \to \infty} \beta_i = 0$,

(ii) $a_i > 0 \Leftrightarrow \beta_i > 0$,

(iii) $0 < \sum_{i=1}^{\infty} a_i < \infty$,

(iv) $\widetilde F$ is a finite measure supported by $[-\eta, \eta]^c$.

Parts (i) and (ii) together ensure the uniqueness of the numbers $(a_i, \beta_i)$
in the representation of $F$, whereas if this representation holds for some
$\eta > 0$, it also holds for all $\eta' \in (0, \eta)$, with the same $(a_i, \beta_i)$. Part (iii) ensures
that the infinite sum in the representation converges, without being zero (so
equivalently, $a_1 > 0$, or $\beta_1 > 0$).

The class $\mathcal{T}_3^{(1)}$ contains all sums of symmetric stable Lévy measures. On
the other hand, it is contained in the class of all Lévy measures $F$ of a Lévy
process satisfying Assumption 1: the latter is the class $\mathcal{T}_3^{(2)}$ of all $F$ such
that

(12) $u \le 1 \;\Rightarrow\; \Bigl| F([-u, u]^c) - \sum_{i=1}^{j} \dfrac{a_i}{u^{\beta_i}} \Bigr| \le \dfrac{a'}{u^{\beta_{j+1}}}$

for $2 > \beta_1 > \cdots > \beta_{j+1} \ge 0$ and $a_i > 0$ for $i = 1, \ldots, j$ and $a' \ge 0$, and those
conditions are implied by (11), for any $j < \sup(i : \beta_i > 0)$, with the same
$\beta_i$ and $a_i$.

Considering $a_i$ and $\beta_i$ as functions on $\mathcal{T}_3^{(1)}$, the identifiability result goes
as follows:

Theorem 1. In the previous setting, the following holds:

(i) The functions $\beta_1$ and $a_1$ are identifiable on the set $\mathcal{T}_3^{(1)}$.

(ii) For any given $i \ge 2$, the functions $\beta_i$ and $a_i$ are identifiable on the
subset $\mathcal{T}_3^{(1)}(i) = \{F \in \mathcal{T}_3^{(1)} : \beta_i(F) > \beta_1(F)/2\}$ of $\mathcal{T}_3^{(1)}$, and they are not on
the complement $\mathcal{T}_3^{(1)} \setminus \mathcal{T}_3^{(1)}(i)$.
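
The condition in part (ii) is easy to apply to a given list of indices. The following sketch is only an illustration of the statement of Theorem 1: the function name is hypothetical, and the input is assumed to be a strictly decreasing list of indices in $(0, 2)$.

```python
def identifiable_indices(betas):
    """Return the 1-based positions i whose index beta_i is identifiable in
    the sense of Theorem 1: i = 1 always, and i >= 2 only if
    beta_i > beta_1 / 2. `betas` is assumed strictly decreasing in (0, 2)."""
    leading = betas[0]
    return [i for i, b in enumerate(betas, start=1) if i == 1 or b > leading / 2]

# With beta_1 = 1.6 the boundary is 0.8, so only the first two indices qualify.
positions = identifiable_indices([1.6, 1.0, 0.7, 0.3])
```

Since the indices are decreasing, the identifiable positions always form an initial segment $1, \ldots, k$ of the list.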

Remark 3. As mentioned in the "first extension" described in the previous
section, a similar statement is true if we replace the first line of (11)
by

$F(dx) = \widetilde F(dx) + \sum_{i=1}^{\infty} \Bigl( \dfrac{a_i^{(+)} \beta_i^{(+)}}{|x|^{1+\beta_i^{(+)}}}\, 1_{(0, \eta]}(x) + \dfrac{a_i^{(-)} \beta_i^{(-)}}{|x|^{1+\beta_i^{(-)}}}\, 1_{[-\eta, 0)}(x) \Bigr) dx$

with both families $(\beta_i^{(\pm)}, a_i^{(\pm)})$ satisfying (i)-(iii). Then the theorem above
holds for both these families, with exactly the same proof.

Remark 4. As said before, any Lévy process $X$ whose Lévy measure $F$
is in $\mathcal{T}_3^{(1)}$ satisfies Assumption 1, but the converse is far from being true,
so, even for Lévy processes, the identifiability question is not completely
solved under Assumption 1. More precisely, as the estimation results will
show below, (12) implies the "positive" identifiability results [(i) and the
first part of (ii) of Theorem 1] for Lévy processes, but not the "negative"
results [second part of (ii)].

For example, consider the class $\mathcal{T}_3^{(3)}$ of all measures of the form

$F(dx) = \dfrac{a_1 \beta_1}{x^{1+\beta_1}}\, 1_{(0, 1]}(x)\,dx + G(dx)$ with $G = a_2 \sum_{n \ge 1} \varepsilon_{1/n^{1/\beta_2}}(dx)$

and $0 < \beta_2 < \beta_1 < 2$ and $a_1, a_2 > 0$. Any such $F$ satisfies (12), but not (11).
On $\mathcal{T}_3^{(3)}$, all four parameters $\beta_1, \beta_2, a_1, a_2$ are identifiable without the restriction
$\beta_2 > \beta_1/2$. This is of course due to the fact that the measure $G$ is singular,
and any two measures $G$ and $G'$ of the same type with $(\beta_2, a_2) \ne (\beta'_2, a'_2)$
have a Lebesgue decomposition $G' = g \cdot G + G'^{\perp}$ with $G'^{\perp}(\mathbb{R}) = \infty$ when
$\beta_2 \ne \beta'_2$ and $\alpha(G, G') = \infty$ when $\beta_2 = \beta'_2$ and $a_2 \ne a'_2$.

We emphasize again that this example is quite singular, and it illustrates
the fairly general principle that the less regular a statistical problem is, the
easier it is to solve in the sense that more parameters can be estimated, and
often with faster rates.

Remark 5. The class $\mathcal{T}_3^{(2)}$ may be bigger than $\mathcal{T}_3^{(1)}$, but it is very far
from containing all possible Lévy measures. Indeed, any decreasing right-continuous
function $f$ on $(0, \infty)$ with $f(x) \to 0$ as $x \to \infty$ and $f(x) \le K/x^{\alpha}$
for $x \in (0, 1]$, for some constants $K > 0$ and $\alpha \in (0, 2)$, is the symmetrical
tail $f(x) = F([-x, x]^c)$ of a Lévy measure, although of course it does not
need to be equivalent to $a/x^{\beta}$ as $x \to 0$ for some $\beta \in (0, 2)$ and $a > 0$: so (6)
may fail even with $j = 1$.

4. Discretely observed semimartingales: Preliminary estimators. Now
we turn to the more general case of semimartingales. The process $X$ is
observed at the times $i\Delta_n$ for $i = 0, 1, \ldots, [T/\Delta_n]$ (where $[x]$ denotes the
integer part of the real $x$). We thus observe the increments

(13) $\Delta^n_i X = X_{i\Delta_n} - X_{(i-1)\Delta_n}.$

The BG indices describe some properties of jumps, which are not observed.
However, when an increment $\Delta^n_i X$ is relatively large, say bigger
than $u_n$ with $u_n \gg \sqrt{\Delta_n}$, it is likely to be due to jumps because the drift
plus the continuous martingale part have increments of order of magnitude
$\sqrt{\Delta_n}$. Moreover it turns out that it is usually due to a single "large" jump
of size bigger than $u_n$, although of course the observed value $\Delta^n_i X$ is not
exactly the jump size. So one may expect the number of jumps with size
bigger than $u_n = u$, over the time interval $[0, t]$, to be the following number,
or be relatively close to it:

(14) $U(u, \Delta_n)_t = \sum_{i=1}^{[t/\Delta_n]} 1_{\{|\Delta^n_i X| > u\}}.$

In order for the previous statement to actually be true, we need some addi- 
tional assumptions, though. Those are given in the following: 

Assumption 2. The process $X$ is an Itô semimartingale, and:
(a) The processes $b_t$, $c_t$ are locally bounded.
(b) We have Assumption 1 with $A^i_t = \int_0^t a^i_s\,ds$ for $i = 1, \ldots, j+1$, where
the processes $a^i$ are locally bounded.

(c) We have $\beta_j > \beta_1/2$.

Assumption 2(c) above may look strange, or too strong. However, in
view of the identifiability results of the previous section, we cannot consistently
estimate $\beta_i$ if it is strictly smaller than $\beta_1/2$, and as a matter of
fact, the estimators described below are consistent only if $\beta_i > \beta_1/2$. Hence,
since Assumption 1 for $j$ implies the same for all $j' < j$, (c) above is really
not a restriction, but amounts to replacing $j$ in this assumption by
$j \wedge \sup\{i : \beta_i > \beta_1/2\}$.

Apart from (c), this assumption is satisfied in Examples 1 and 2, and also 
by any Levy process satisfying Assumption 1. 

The estimation procedure is a two-step procedure, and in this section we 
describe the first — preliminary — estimators. These estimators will be con- 
sistent, but with very slow rates of convergence. This is why, in the next 
subsection, we will derive final estimators which exhibit much faster (al- 
though still slow) rates. 

Those preliminary estimators require the knowledge of a number $\varepsilon > 0$
which satisfies

(15) $i = 1, \ldots, j-1 \;\Rightarrow\; \beta_i - \beta_{i+1} > \varepsilon.$

Such an $\varepsilon$ always exists, but here we suppose that it is known, somewhat
in contradiction with the fact that the $\beta_i$ are unknown. It is obviously
quite difficult to estimate properly two contiguous indices $\beta_i$ and $\beta_{i+1}$ when
they are very close to one another. So from a statistical viewpoint, the
assumption $\beta_i - \beta_{i+1} > \varepsilon$ for some fixed $\varepsilon > 0$ is natural. Moreover, since we
do not know a priori which $\omega$ is observed, this amounts to supposing that
all possible values of the BG indices in the model satisfy this restriction. For
models used in practice, this is not really a restriction since these models rely
on at most a small number of indices that are separated from one another.

The key ingredient for constructing the estimators is the counting process
defined in (14), evaluated at the terminal time $T$ and for suitable values of $u$.
In particular, we choose a sequence $u_n$ satisfying

(16) $u_n \to 0$, $\Delta_n^{\rho} \le K u_n$ for some $\rho < \dfrac{1}{2+\beta_1} \wedge \dfrac{4}{\beta_1(5+3\beta_1)}$.

Of course $\rho > 0$ above (otherwise $u_n \to 0$ would fail). The infimum of the
upper bound for $\rho$ over all $\beta_1 < 2$ is 2/11. Therefore, since we do not a priori
know the values of the $\beta_i$, whereas as we will see the rates improve when the
sequence $u_n$ becomes smaller (termwise), it is thus advisable to take $\rho = 2/11$
above.
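
As an illustration, the counting process and a truncation level of the form $u_n = \Delta_n^{\rho}$ can be sketched in a few lines. This is only a minimal sketch: it assumes that (14) counts the increments of the sampled path exceeding $u$ in absolute value, and the function names and the `scale` factor are hypothetical choices made for the example.

```python
def count_large_increments(path, u):
    """Number of increments of the sampled path with absolute value > u.

    `path` holds the observations X_0, X_{Delta_n}, X_{2 Delta_n}, ...
    (assumed form of the counting process U(u, Delta_n)_T).
    """
    return sum(1 for a, b in zip(path, path[1:]) if abs(b - a) > u)


def truncation_level(delta_n, rho=2 / 11, scale=1.0):
    """Truncation level u_n = scale * delta_n**rho (conservative rho = 2/11)."""
    return scale * delta_n ** rho


# Toy path with increments 0.1, 0.0, 0.5, -0.05, -1.0 (up to rounding).
path = [0.0, 0.1, 0.1, 0.6, 0.55, -0.45]
print(count_large_increments(path, 0.2))
print(truncation_level(0.01))
```

With $\rho = 2/11$ the truncation level decays very slowly in $\Delta_n$, which is one source of the slow rates discussed in this section.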
The first-step estimation is done by induction on $i$. We choose $\gamma > 1$, and
the estimators for $\beta_1$ and $A^1_T$ are

(17)
$$\tilde\beta^n_1 = \begin{cases} \dfrac{\log(U(u_n,\Delta_n)_T / U(\gamma u_n,\Delta_n)_T)}{\log\gamma}, & \text{if } U(\gamma u_n,\Delta_n)_T > 0, \\ -1, & \text{otherwise,} \end{cases} \qquad \tilde\Gamma^n_1 = (u_n)^{\tilde\beta^n_1}\, U(u_n,\Delta_n)_T.$$
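
The first-step estimators in (17) only involve two counts, so they are easy to sketch. The code below is a hedged illustration, not a reference implementation: it takes the two counts $U(u_n,\Delta_n)_T$ and $U(\gamma u_n,\Delta_n)_T$ as given inputs, and the function name is hypothetical.

```python
import math


def first_step_estimators(count_u, count_gamma_u, u, gamma):
    """Preliminary estimators of the leading index and of A^1_T, as in (17).

    count_u       : number of increments larger than u in absolute value
    count_gamma_u : number of increments larger than gamma * u
    """
    if gamma <= 1:
        raise ValueError("gamma must be > 1")
    if count_gamma_u > 0:
        beta = math.log(count_u / count_gamma_u) / math.log(gamma)
    else:
        beta = -1.0  # convention of (17) when the second count vanishes
    gamma_hat = u ** beta * count_u
    return beta, gamma_hat


# Sanity check on exact power-law counts U(u) = A / u**beta (no noise).
A, beta_true, u, g = 50.0, 1.2, 0.01, 2.0
b, a = first_step_estimators(A / u ** beta_true, A / (g * u) ** beta_true, u, g)
print(b, a)
```

On exact power-law counts the sketch recovers the index and the scale up to rounding; on real data the counts are noisy and the convergence is slow, as stated in the text.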
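For constructing the subsequent estimators, and with $\varepsilon$ as in (15), we set

(18) $u_{n,i} = u_n^{\varepsilon^{i-1}}$

(so $u_{n,1} = u_n$). We denote by $I(k,l)$ the set of all subsets of $\{1,\ldots,k\}$ having $l$
elements. Assuming that we know $\tilde\beta^n_i$ and $\tilde\Gamma^n_i$ for $i = 1,\ldots,k-1$, for some
$k \in \{2,\ldots,j\}$, we set

$$x \ge 1 \;\Rightarrow\; U^n(k,x) = \sum_{l=0}^{k-1} (-1)^l\, U(x\gamma^l u_{n,k},\Delta_n)_T \sum_{J \in I(k-1,l)} \gamma^{\sum_{m\in J}\tilde\beta^n_m},$$

$$\tilde\beta^n_k = \begin{cases} \dfrac{\log(U^n(k,1)/U^n(k,\gamma))}{\log\gamma}, & \text{if } U^n(k,1) > 0,\ U^n(k,\gamma) > 0,\\ -1, & \text{otherwise,}\end{cases}$$

$$\tilde\Gamma^n_k = (u_{n,k})^{\tilde\beta^n_k}\Big(U(u_{n,k},\Delta_n)_T - \sum_{l=1}^{k-1} \tilde\Gamma^n_l\,(u_{n,k})^{-\tilde\beta^n_l}\Big).$$
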

Finally, in order to state the result, we need a further notation, for i = 
1, . . . , j — 1 (so when j = 1 the following is empty): 

m H = 4 +1 nu7 ft - ft+i -i) 

{ ) Afp log 7 nt!(7 A - ft -l) ' 

Theorem 2. Under Assumption 2 and (15), for all $i = 1, \ldots, j-1$ such



Moreover if $\eta = \beta_j - \beta_{j+1} \vee (\beta_1/2) > 0$, the following variables are bounded in
probability:
The estimator $\tilde\beta^n_1$ is exactly the estimator proposed in Aït-Sahalia and
Jacod (2009a) for the leading BG index $\beta_1$. So, not only does it satisfy (21)
when $j \ge 2$ or the tightness of (22) when $j = 1$, but it also enjoys a central
limit theorem centered at $\beta_1$ and with rate $u_n^{\beta_1/2}$ as soon as $\beta_2 < \beta_1/2$
(this property implies $j = 1$ here). Moreover, in this case one could prove
that $\tilde\Gamma^n_1$ also satisfies a CLT with the rate $u_n^{\beta_1/2}\log(1/u_n)$, although we
will not prove it, since the emphasis here is on the case of several BG
indices.

Some remarks are in order here: 

Remark 6. It is possible for the estimator to be negative, in which 
case we may replace it by a or by any other positive number. It may also 
happen that the sequence $\tilde\beta^n_i$ is not decreasing in $i$, and we can then reorder
the whole family so as to obtain a decreasing family (we relabel the estimators
of $A^i_T$ accordingly, of course). All these modifications are asymptotically
immaterial. 

Remark 7. As mentioned in Extension 2 at the end of Section 2,
we can relax $A^i_T > 0$ in Assumption 1. Then the above theorem is still valid,
in restriction to the set $\Omega_T(\{l_1, \ldots, l_m\})$ of (7), as soon as $\beta_{l_m} > \beta_{l_1}/2$.

Remark 8. Suppose that $j \ge 2$. The limits in (21) are pure bias, hence
precluding the existence of a proper central limit theorem. Note that $H_i > 0$
if $i < j$, so the biases for $\tilde\beta^n_i$ and for $\tilde\Gamma^n_i$ are always negative and positive,
respectively.

Note also that the rate of convergence for estimating $\beta_i$ when $i \le j-1$, say,
is $u_{n,i}^{\beta_i - \beta_{i+1}}$, that is $u_n^{(\beta_i - \beta_{i+1})\varepsilon^{i-1}}$. This is exceedingly slow, indeed. For
example, suppose that we have three indices $\beta_1 > \beta_2 > \beta_3 > \beta_1/2$. Then (15)
implies necessarily $\varepsilon < \beta_1/4$, so the best possible rate for $i = 2$ would be less
than, but close to, $u_n^{(\beta_2 - \beta_3)\beta_1/4}$ upon taking $\varepsilon$ close to $\beta_1/4$, which is of course
impossible because we do not know $\beta_1$ to start with.

In the previous example, if we suspect that $\beta_1$ is bigger than 1, say, it
becomes (perhaps) not totally unreasonable to choose $\varepsilon = 0.1$; the rates for
$i = 2$ and $i = 3$ thus become $u_n^{(\beta_2 - \beta_3)/10}$ and $u_n^{(\beta_3 - \beta_1/2)/100}$. This is of course
on top of the fact that, because of (16), $u_n$ is of order of magnitude $\Delta_n^{2/11}$,
by a conservative choice of $\rho$.

Practical considerations. Leaving aside the slow convergence rates, the
previous result suffers from two main drawbacks:

(1) It requires knowing the number of indices to be estimated (this is
implicit in Assumption 2).

(2) It requires knowing a number $\varepsilon > 0$ satisfying (15).
About the first problem above, in the real world one does not know the
number of indices. On the other hand, if Assumption 1 holds, it seems
reasonable to suppose that it holds for all $j$, whereas the estimation is made
for those indices which are bigger than $\beta_1/2$ only. In connection with this, we
assume $\beta_i - \beta_{i+1} > \varepsilon$ for all $i < j := \sup(k : \beta_k > \beta_1/2)$, plus the property
$\beta_j > \beta_1/2 + \varepsilon$. Then the aim becomes to estimate $\beta_i$ and $A^i_T$ for all $i \le j$,
with $j$ unknown.

Since the estimation procedure is done by induction on the successive
indices, one can start the induction as described above, and stop it at the
first $i$ such that $\tilde\beta^n_i < \varepsilon + \tilde\beta^n_1/2$. Asymptotically, this procedure will deliver
the "correct" answer (the proof of this fact, not given below, is a simple
extension of the proof of the second claim of the theorem). In practice,
however, the solution to this stopping problem is not quite clear, since in
particular the estimated sequence $\tilde\beta^n_i$ is not necessarily decreasing, although
it is so asymptotically.

Problem 2 above is clearly more annoying. We have to admit that, in
the setting presented here, we have no theoretical solution for it.
A possible way out would be to make the estimation with several values
of $\varepsilon$, going downward, until the estimated differences $\tilde\beta^n_i - \tilde\beta^n_{i+1}$ all become
significantly bigger than the chosen $\varepsilon$, but no mathematical result so far is
available in this direction. In addition, since rates are very slow, the
probability that such a difference is bigger than $\varepsilon$ when the true values satisfy the
same inequality may not be close to 1 (for finite, but even large, samples).

Nonetheless, bad as it looks, this condition is probably relatively innocuous
in practice: indeed, when two successive indices are very close to each
other, they are obviously very difficult to tell apart. So the problem is
practically meaningful only if the indices are few in number (say 2, 3 or perhaps 4)
and reasonably well separated. Hence taking $\varepsilon = 0.1$ for instance, as in
Remark 5, seems to be safe enough.

5. Discretely observed semimartingales: An improved method. The
observation scheme is the same as in the previous section: $X$ is observed at
the times $i\Delta_n$ smaller than or equal to some fixed terminal time $T$.

As already mentioned, the previous estimators converge at a very slow 
rate, especially for higher order indices; see Remark 8. So, in order to imple- 
ment the estimation with any kind of reasonable accuracy, it is absolutely 
necessary to come up with better estimators. 

This is the aim of this section. Assuming Assumption 2, we also suppose 
that we can construct preliminary estimators, such as in the previous section. 
Exactly as there, we must know the number j of BG indices that are to be 
estimated. 

The method consists in minimizing, at each stage $n$, a suitably chosen
contrast function $\Phi_n$. First we take an integer $L \ge 2j$ and numbers $1 = v_1 <
v_2 < \cdots < v_L$. We also choose positive weights $w_l$ (typically $w_l = 1$, but any
choice is indeed possible), and we pick truncation levels $u_n$ satisfying (16).
We also let $D$ be the set of all $(x_i, \gamma_i)_{1 \le i \le j}$ with $0 < x_j < x_{j-1} < \cdots < x_1 < 2$
and $\gamma_i > 0$. Then the contrast function is defined on $D$ by

(23) $\Phi_n(x_1,\gamma_1,\ldots,x_j,\gamma_j) = \displaystyle\sum_{l=1}^{L} w_l \Big( U(v_l u_n, \Delta_n)_T - \sum_{i=1}^{j} \frac{\gamma_i}{(v_l u_n)^{x_i}} \Big)^2,$

where the sequence $u_n$ satisfies (16). Then the estimation goes as follows:

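Step 1. We construct preliminary estimators $\tilde\beta^n_i$ (decreasing in $i$) and
$\tilde\Gamma^n_i$ (nonnegative) for $\beta_i$ and $A^i_T$ for $i = 1, \ldots, j$, such that $(\tilde\beta^n_i - \beta_i)/u_n^{\eta}$ and
$(\tilde\Gamma^n_i - A^i_T)/u_n^{\eta}$ go to 0 in probability for some $\eta > 0$. For example, we may
choose those described in the previous section (see Remark 6): the
consistency requirement is fulfilled for any $\eta < (\varepsilon/2)^j$.

Step 2. We denote by $D_n$ the (compact and nonempty) random subset
of $D$ defined by $D_n = \{(x_i,\gamma_i) \in D : |x_i - \tilde\beta^n_i| \le a u_n^{\eta},\ |\gamma_i - \tilde\Gamma^n_i| \le a u_n^{\eta},\ \forall i =
1,\ldots,j\}$, for some arbitrary (fixed) $a > 0$. Then the final estimators $\hat\beta^n_i$
and $\hat\Gamma^n_i$ will be

(24) $(\hat\beta^n_i, \hat\Gamma^n_i)_{1 \le i \le j} = \operatorname*{argmin}_{D_n} \Phi_n(x_1,\gamma_1,\ldots,x_j,\gamma_j).$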
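
The contrast function (23) and the minimization (24) can be sketched as follows. This is a minimal illustration under stated assumptions: the counts $U(v_l u_n, \Delta_n)_T$ are passed in as a list, the search is a crude grid search standing in for a proper numerical optimizer over $D_n$, and all function and argument names are hypothetical.

```python
from itertools import product


def contrast(params, counts, levels, weights=None):
    """Weighted squared distance between observed counts and the power-law model.

    params : list of (x_i, gamma_i) pairs
    counts : observed counts at the truncation levels
    levels : truncation levels v_l * u_n
    """
    if weights is None:
        weights = [1.0] * len(levels)
    total = 0.0
    for w, c, lev in zip(weights, counts, levels):
        model = sum(g / lev ** x for x, g in params)
        total += w * (c - model) ** 2
    return total


def minimize_on_grid(counts, levels, x_grids, g_grids, weights=None):
    """Grid search over candidate indices and scales (stand-in for argmin over D_n)."""
    best, best_val = None, float("inf")
    for xs in product(*x_grids):
        if any(a <= b for a, b in zip(xs, xs[1:])):
            continue  # enforce x_1 > x_2 > ... as in the definition of D
        for gs in product(*g_grids):
            val = contrast(list(zip(xs, gs)), counts, levels, weights)
            if val < best_val:
                best, best_val = list(zip(xs, gs)), val
    return best, best_val


# Noise-free example with two indices (1.0, 0.75).
truth = [(1.0, 3.0), (0.75, 5.0)]
levels = [0.01 * v for v in (1, 2, 4, 8)]
counts = [sum(g / lev ** x for x, g in truth) for lev in levels]
best, val = minimize_on_grid(
    counts, levels,
    x_grids=[[0.9, 1.0, 1.1], [0.7, 0.75, 0.8]],
    g_grids=[[2.0, 3.0, 4.0], [4.0, 5.0, 6.0]],
)
print(best, val)
```

In practice a local optimizer started from the preliminary estimators would replace the grid, and several starting points are advisable because nonlinear least-squares objectives of this kind can have local extrema.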

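Theorem 3. Under Assumption 2, and for all choices of $v_2, \ldots, v_L$ outside
a $\lambda_{L-1}$-null set (depending on the $\beta_i$'s; $\lambda_l$ is the $l$-dimensional Lebesgue
measure), the sequences

(25) $\dfrac{\hat\beta^n_i - \beta_i}{u_n^{\beta_i - \beta_1/2 - \mu}}, \qquad \dfrac{\hat\Gamma^n_i - A^i_T}{u_n^{\beta_i - \beta_1/2 - \mu}}$

are bounded in probability for all $i = 1, \ldots, j$ and all $\mu > 0$.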

The rates obtained here are much faster than in Theorem 2: we replace
$u_{n,i}^{\beta_i - \beta_{i+1} \vee (\beta_1/2)}$ by $u_n^{\beta_i - \beta_1/2 - \mu}$, for two reasons: the exponent $\beta_i - \beta_1/2$ is
bigger than $\beta_i - \beta_{i+1} \vee (\beta_1/2)$, unless $i = j$; more importantly, we replace the
auxiliary truncation levels $u_{n,i}$ of (18) by the original sequence $u_n$, which is
much smaller when $i \ge 2$, and only subject to (16). We will examine in the
next section how far from optimality those rates are.

Remark 9. As stated, and as seen from the proof, we only need $L = 2j$,
and choosing $L > 2j$ does not improve the asymptotic properties. However,
from a practical viewpoint, it is probably wise to take $L$ bigger than $2j$
in order to smooth out the contrast function somehow, especially for
(relatively) small samples. A choice of the weights $w_l > 0$ other than $w_l = 1$, such
as $w_l$ decreasing in $l$, may serve to put less emphasis on the large truncation
values $u_n v_l$ for which less data are effectively used.
Remark 10. The result does not hold (or at least we could not prove it)
for all choices of the $v_l$'s, but only when $(v_2, \ldots, v_L)$ (recall $v_1 = 1$) does not
belong to some Lebesgue-null set $G(\beta_1, \ldots, \beta_j)$. This seems a priori a serious
restriction, because $(\beta_1, \ldots, \beta_j)$ is unknown. In practice, we choose a priori
$(v_2, \ldots, v_L)$, so we may have bad luck, just as we may have bad luck for the
outcome $\omega$ which is drawn. . . .

We may also do the estimation for a number of different choices for the 
weights and/or values of L > 2j and compare or average the results. This 
should contribute to weaken the numerical instability inherent to minimiza- 
tion problems such as (24). This numerical instability is similar to the one 
occurring in nonlinear regression problems. 

We have to state, however, that these problems, just as those stated in the 
"practical considerations" of the previous section, are not fully addressed in 
this paper, and they are probably quite difficult to overcome. Our emphasis 
here is more on theoretical results, and on the possibility of performing the 
estimation with reasonable rates (see, however, Section 7 below, to see how 
the problem of finding a "good" $\varepsilon$ and doing preliminary estimation in our
simulation study is skipped, without affecting the quality of the procedure 
in any noticeable way). 

6. Optimality in a special case. 

6.1. Why the convergence rates are necessarily slow. Intuitively, the fact 
that we are right at the boundary between identifiability and lack thereof 
suggests that we should expect the rate, as we approach the loss of identi- 
fiability boundary, to deteriorate all the way to zero. In order to quantify 
precisely how slow the rates of convergence for the estimators of the second 
(and higher) index must be, even in ideal circumstances, we study a simple 
parametric model of the following form. Let $W$ be a Brownian motion and
$Y^1$, $Y^2$ be two independent standard symmetric stable processes, and set

(26) $X_t = bt + \sigma W_t + Y^1_t + Y^2_t$.

Each $Y^i$ depends on two parameters, the index $\beta_i$ and a scale parameter
$a_i$, the latter being characterized by the fact that the Lévy measure of $Y^i$
is

(27) F\dx) = -^ r dx. 

\x\ +p i 

We have six parameters,

(28) $b \in \mathbb{R}$, $c = \sigma^2 > 0$, $a_1, a_2 > 0$, $0 < \beta_2 < \beta_1 < 2$,

among which $b$ is not identifiable, and $c, \beta_1, a_1$ are identifiable, and $(\beta_2, a_2)$
are identifiable if and only if $\beta_2 \ge \beta_1/2$. In what follows, we restrict our
attention to the four parameters $\beta_1, \beta_2, a_1, a_2$.

In order to find at which rate it is possible to estimate these four
parameters, when $X$ is observed at the discrete times $(i\Delta_n : i = 0, 1, \ldots, [T/\Delta_n])$
and $\Delta_n \to 0$, we study the behavior of the Fisher information matrix. Due
to the fact that $X$ is a Lévy process, the information matrix at stage $n$ is
$[T/\Delta_n]$ times the information matrix obtained when we observe only the
variable $X_{\Delta_n}$; since the variable $X_\Delta$ admits a density $x \mapsto p_\Delta(x \mid c, \beta_1, a_1,
\beta_2, a_2)$ which is $C^\infty$ in $x$, and also in $(c, \beta_1, a_1, \beta_2, a_2)$ on the domain defined
by (28), it is no wonder that Fisher's information $I_\Delta$ for a single
observation $X_\Delta$ (recall $X_0 = 0$) exists, and we can study its behavior as $\Delta \to 0$.

Only the diagonal entries are important for the various rates of
convergence, so we only need to focus on the following diagonal entries of this
matrix:

$$I^{\beta_1\beta_1}_\Delta, \quad I^{a_1a_1}_\Delta, \quad I^{\beta_2\beta_2}_\Delta, \quad I^{a_2a_2}_\Delta.$$

The main result of this section follows, giving the asymptotic order of the 
relevant terms in Fisher's information: 

Theorem 4. We have the following equivalences, as $\Delta \to 0$:
~ TT7^W^A 1 ^/ 2 (log(l/A)) 2 -^/ 2 , 



aiai 



2(2-/3i)/ 3 i/2 c /V2 

2ft C/3l af A 1 ^/ 2 



A (2 - /3i)/V2 a /3i a 2 (^(l/A))/ 3 !/ 2 

and also, provided $\beta_2 > \beta_1/2$,

W3 2 ft> a 2^2 



2a 1 /3 1 (2/3 2 - ft) (2 - foh-fkfrcfa-M* 
x A^+^OogU/A)) 2 "^!/ 2 , 

OR 2 Al-/3 2 +ft/2 
ra 2 a 2 *P2 ^ 



[ A 



aift(2ft - ft)(2 - pJfo-Pi/Zcfh-Pi/* (log(l/A))^-ft/2 ' 



Remark 11. We are not concerned here with the identification and
estimation of the volatility parameter $c$; the term $I^{cc}_\Delta$ in a simpler model has
been studied in Aït-Sahalia and Jacod (2008), as well as $I^{a_1a_1}_\Delta$ when $a_2 = 0$
(i.e., when there is only one stable process on top of the Brownian motion).
The asymptotic equivalent for the term $I^{a_1a_1}_\Delta$ of course reduces to (4.11) of
that paper, with $\alpha = \beta_1$, $\beta = 2$, $\theta = a_1$, up to a change of parametrization
for $a_1$, since here we use the parametrization (27) which corresponds to the
notation of Assumption 1, which is fulfilled here.

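Coming back to the original problem, we deduce that it should be possible
in principle to find estimators $\hat\beta^n_i$ and $\hat a^n_i$ having the following properties:

(29)
$$\begin{aligned}
\frac{(\log(1/\Delta_n))^{1-\beta_1/4}}{\Delta_n^{\beta_1/4}}\,(\hat\beta^n_1 - \beta_1) &\xrightarrow{\mathcal{L}} \mathcal{N}(0, 1/T\bar I^{\beta_1\beta_1}),\\
\frac{1}{\Delta_n^{\beta_1/4}(\log(1/\Delta_n))^{\beta_1/4}}\,(\hat a^n_1 - a_1) &\xrightarrow{\mathcal{L}} \mathcal{N}(0, 1/T\bar I^{a_1a_1}),\\
\frac{(\log(1/\Delta_n))^{1-\beta_2/2+\beta_1/4}}{\Delta_n^{\beta_2/2-\beta_1/4}}\,(\hat\beta^n_2 - \beta_2) &\xrightarrow{\mathcal{L}} \mathcal{N}(0, 1/T\bar I^{\beta_2\beta_2}),\\
\frac{1}{\Delta_n^{\beta_2/2-\beta_1/4}(\log(1/\Delta_n))^{\beta_2/2-\beta_1/4}}\,(\hat a^n_2 - a_2) &\xrightarrow{\mathcal{L}} \mathcal{N}(0, 1/T\bar I^{a_2a_2}),
\end{aligned}$$

where $\bar I^{\beta_1\beta_1}$, $\bar I^{a_1a_1}$, $\bar I^{\beta_2\beta_2}$ and $\bar I^{a_2a_2}$ are the constants in front of the term
involving $\Delta$ in the equivalences above, for $I^{\beta_1\beta_1}_\Delta$, $I^{a_1a_1}_\Delta$, $I^{\beta_2\beta_2}_\Delta$ and $I^{a_2a_2}_\Delta$,
respectively. Conversely, by the Cramér-Rao lower bound, Theorem 4 also
implies that it will be impossible to find consistent estimators with faster
rates of convergence, or smaller asymptotic variance, than those exhibited
in (29).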

Note that these rates are consistent with the results of Theorem 1. The
first two convergences above show that it is always possible to estimate
consistently $\beta_1$ and $a_1$, the third one implies consistency for $\beta_2$ only if $\beta_2 >
\beta_1/2$, and the last one implies consistency for $a_2$ only if $\beta_2 > \beta_1/2$. The
last statement seems contradictory with Theorem 1 when $\beta_2 = \beta_1/2$, but
of course it is possible to have a (somewhat irregular) statistical model for
which consistency holds even though the Fisher information does not go to
infinity.

6.2. Comparison of rates. Now, we can compare these optimal rates with 
the rates obtained in Theorem 3. In doing so, we compare a semiparametric
model with a parametric sub-model. However, a minimax rate for
a given parameter in a semiparametric model cannot be faster than the 
rate obtained for any parametric sub-model, hence the previous results are 
bounds for the rates in the general model considered in this paper. 

Neglecting the logarithmic terms, and considering only the estimation
of $\beta_i$ for $i = 1,2$, the rates above are $\Delta_n^{\gamma_i}$, whereas in Theorem 3, and upon
choosing $u_n$ optimally [i.e., $\rho$ as large as possible in (16)], they are $\Delta_n^{\bar\gamma_i}$,
where

$$\gamma_i = \frac{2\beta_i - \beta_1}{4}, \qquad \bar\gamma_i = \begin{cases} \dfrac{2\beta_i - \beta_1}{2(2+\beta_1)} - \varepsilon, & \text{if } \beta_1 < (\sqrt{97}-1)/6 \approx 1.475,\\[2mm] \dfrac{2(2\beta_i - \beta_1)}{\beta_1(5+3\beta_1)} - \varepsilon, & \text{otherwise,}\end{cases}$$

and $\varepsilon > 0$ arbitrarily small (and if $\beta_2 > \beta_1/2$ when $i = 2$).

As it should be, we have $\bar\gamma_i < \gamma_i$, and if equality were holding we would
conclude that our estimators achieve the minimax rate (up to $\Delta_n^{-\varepsilon}$, of course,
but $\varepsilon$ is arbitrarily small). What one can say is that the actual minimax rate
lies somewhere in between these two values, and the ratio $\bar\gamma_i/\gamma_i$ is a kind of
(imperfect) measure of the quality of the estimators proposed in Section 5:
the closer to 1, the closer to optimality. Then we can conclude the following:

(a) This ratio is the same for $i = 1,2$, which is an a priori surprising result:
the quality of our estimator for $\beta_2$, relative to the optimal estimators in the
stable sub-model, is the same as for $\beta_1$.

(b) This ratio is close to 1 (near optimality) when $\beta_1$ is small, and
decreases down to 4/11 as $\beta_1$ increases up to 2. The worst value is small,
but not catastrophically so, especially in light of the fact that we are
considering semiparametric estimators whereas the rates are optimal in the
parametric context (i.e., assuming additional structure).
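
The behavior of this ratio can be checked numerically. The sketch below rests on an assumption that should be read as such: it takes the upper bound on $\rho$ in (16) to be $\min\{1/(2+\beta_1),\ 4/(\beta_1(5+3\beta_1))\}$, a form that is consistent with the values 2/11, $(\sqrt{97}-1)/6$ and 4/11 quoted in the text, and it neglects $\varepsilon$ and the logarithmic terms, so that the ratio is simply $2\rho$. The function names are hypothetical.

```python
import math


def rho_upper_bound(beta1):
    """Assumed upper bound on rho in (16), as a function of the leading index."""
    return min(1.0 / (2.0 + beta1), 4.0 / (beta1 * (5.0 + 3.0 * beta1)))


def rate_ratio(beta1):
    """Ratio of the Theorem 3 exponent to the optimal exponent (epsilon neglected).

    The Theorem 3 exponent is rho * (beta_i - beta_1 / 2) and the optimal one
    is (2 * beta_i - beta_1) / 4, so the ratio is 2 * rho for every i.
    """
    return 2.0 * rho_upper_bound(beta1)


crossover = (math.sqrt(97) - 1) / 6
for b in (0.1, 0.5, 1.0, crossover, 1.75, 2.0):
    print(round(b, 3), round(rate_ratio(b), 4))
```

Under this assumption the ratio does not depend on $i$, is close to 1 for small $\beta_1$, and tends to 4/11 as $\beta_1$ approaches 2, in line with (a) and (b).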

7. Simulation results. We now provide some simulation evidence
regarding the estimators in the case where $j = 2$; we are attempting to estimate
the first two jump activity indices of the process, $\beta_1$ and $\beta_2$. The data
generating process is a stochastic volatility model for $X_t$ with jumps driven by
two stable processes $Y^1$ and $Y^2$, with $W, Y^1, Y^2$ independent below:

(30) $dX_t = \sigma_t\,dW_t + \theta_1\,dY^1_t + \theta_2\,dY^2_t$

with $\sigma_t = v_t^{1/2}$, $dv_t = \kappa(\eta - v_t)\,dt + \gamma v_t^{1/2}\,dB_t + dJ_t$, $E[dW_t\,dB_t] = \rho\,dt$, $\eta^{1/2} =
0.25$, $\gamma = 0.5$, $\kappa = 5$, $\rho = -0.5$; the volatility jump term $J$ is a compound
Poisson jump process with jumps that are uniformly distributed on $[-0.3, 0.3]$
and intensity $\lambda = 10$, and $X_0 = 1$. Recall that the second component can
be identified only if $\beta_2 > \beta_1/2$. We consider the situation where $(\beta_1, \beta_2) =
(1.00, 0.75)$.

Given $\eta$, each scale parameter $\theta_i$ (or equivalently $A^i_T$) of the stable process
in simulations is calibrated to deliver various values of the tail
probability $P_i = P(|\Delta Y^i_t| \ge 4\eta^{1/2}\Delta_n^{1/2})$. In the simulation design,
we hold $\eta$ fixed and consider the cases where $P_1 = 0.05$ and $P_2 = 0.005$. We
sample the process $X$ over $T = 21$ days (6.5 hours per day) every $\Delta_n = 0.01$
second. This results, of course, in a number of observations (nearly $5 \times 10^7$)
that is unrealistically high for most high-frequency financial data series, at
least presently, but extremely large numbers of observations are needed if we
are going to be able to see the component of the model "behind" the two
components with indices of activity 2 (the continuous component) and $\beta_1$
(the most active jump component). Of course, much smaller datasets would
be sufficient in the absence of a continuous component.
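
The sample size follows directly from the sampling design. The short computation below, with a hypothetical function name, simply converts the stated span and sampling interval into a number of observations.

```python
def number_of_observations(days, hours_per_day, delta_seconds):
    """Number of sampling intervals of length delta_seconds over the whole span."""
    total_seconds = days * hours_per_day * 3600
    return round(total_seconds / delta_seconds)


n = number_of_observations(days=21, hours_per_day=6.5, delta_seconds=0.01)
print(n)
```

The result, 49,140,000 observations, is the "nearly $5 \times 10^7$" figure quoted above.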

Note that in general, and besides the preliminary estimators $\tilde\beta^n_i$ and $\tilde\Gamma^n_i$,
we need to choose the number $a > 0$ coming in the definition of the set $D_n$.
Since in practice $n$ (or $\Delta_n$) is given, we need to choose in fact the number
$a u_n^{\eta}$. So in concrete situations one probably can forget about the preliminary
estimators and take a domain $D_n$ which is the set of all $(x_i, \gamma_i)$ in $D$ with
$\gamma_i \le A$ for some "reasonably chosen" $A$, or even $A = \infty$.
(31)

where the cutoff levels $v_l u_n$ are chosen in terms of the number $\alpha_l$ of
long-term standard deviations $\sqrt{\eta\Delta_n}$ over a time lag $\Delta_n$ of the continuous
martingale part of the process: we take $\alpha_l$ to be {7, 10, 15, 20} and multiples
{2,4,6} thereof (giving all together $L = 10$ distinct values). Here we know $\eta$:
we could also estimate for each path the average volatility, using truncated
estimators for the integrated volatility [see, e.g., Mancini (2004) and
Aït-Sahalia and Jacod (2009b)].

The optimization problem (31) is a quadratic problem similar to classi- 
cal nonlinear least squares minimization. In situations where the parameter 
space is high dimensional, the objective function can exhibit local extrema, 
which can make the search for the optimal solution time-consuming as many 
starting values must be employed to validate the solution. In the case of the 
application here, we are only including 4 parameters, and for this small di- 
mension, this is not causing many difficulties. In any case, it is unlikely, 
given the slow rates of convergence, that one would want to go beyond the
second index $\beta_2$ in practice.

The results in Figure 2 are obtained with M = 1000 simulations: the 
estimators appear to be reasonably good, but then again this is for an un- 
realistically large number of observations, at least from the point of view 
of financial applications; it is perhaps feasible in other applications, such as 
Internet data traffic or wind measurement. 

8. Conclusions. This paper determined theoretically what the successive 
BG indices are and how they are identified, including the perhaps surprising 
theoretical bound on the identification of the successive indices as a function 
of the previous ones. This result clarifies the border between the aspects of 
the jump measure which are identifiable and those which are not, on the
basis of discrete observations on a finite time horizon. Beyond the leading 
index, the identification requires in practice vast quantities of data which 
are out of reach of financial applications at present but may be relevant in 
other fields (such as the study of turbulence data, or Internet traffic). We 
showed through explicit calculations of Fisher's information that this limi- 
tation is a genuine, inescapable feature of the problem. There are a number 
of important questions that this paper does not touch upon: central limit 
theorems for the estimators, estimators that achieve the optimal rates of 
convergence, estimators that are robust to microstructure noise, estimators 
that are applicable with random sampling intervals, among others. The issue 
of the optimality of the rates in general remains an open question. 
Fig. 2. Monte Carlo simulation results: estimators $\hat\beta^n_1$ (upper left graph), $\hat\beta^n_2$ (upper right
graph), estimators of $A^1_T$ (lower left graph) and of $A^2_T$ (lower right graph).



APPENDIX: PROOFS 

We use the following notation throughout the Appendix. First, K denotes 
a constant which may change from line to line, and may depend on the 
characteristics or the law of the processes at hand. It never depends on n, 
and it is denoted as K p if it depends on an additional parameter p. Second, 
for any sequence Z n of variables and any sequence v n of positive numbers, 

(32) $Z_n = \begin{cases} O_P(v_n), & \text{if } Z_n/v_n \text{ is bounded in probability},\\ o_P(v_n), & \text{if } Z_n/v_n \xrightarrow{P} 0.\end{cases}$



APPENDIX A: PROOF OF THEOREM 1 

(1) We fix F G T 3 {1) , with F given by (11). We also consider another 

F' G T 3 , with F' given by (11) with f3[, o! i and F'. As said before, it is 
not a restriction to assume the representation (11) with the same $\eta > 0$ for
both $F$ and $F'$. Set

(33) $j = \inf(l \ge 1 : (\beta_l, a_l) \ne (\beta'_l, a'_l))$.

The result amounts to proving the following two properties, with j as above 
and b,b' G R and c,c' > 0: 

(34) Pj>Y => Qb,c,F±Qb>,ci,F>, 
(35) B<^ => f^b"eW,3F"eT^

2 \ with F" = F' on [-77,7?) and Q b ,c,F I Qb" ,c,F> ■ 

These conditions being symmetrical in $F$ and $F'$, in both (34) and (35) we
may assume

(36) either $\beta_j > \beta'_j$, or $\beta_j = \beta'_j$ and $a_j > a'_j$.
(2) In this step we assume (36). We set 

F(dx) = jS^h-vri ( x ) dx > F'( dx ) = T^T^M^v] ( x ) dx - 

i>l i>l I I 

Then F' = f»F, where / = £ (with g = 1) and g = H + G and r/ = H + G" 
and 

i=l 1 1 i>j 1 1 

On [—77, ?/] we have / — 1 = ^"^f and 
G(x) - G"(x) 



By virtue of (ii), (iii) and (iv) of (11), and of (36), we then deduce that 

( A-lxf 1 -^ < \f(x) - 1| < A+\x\h-h, 
(37) x€(-e,e) < A_ A + 

I | x |l+/3i - 

for three constants A + > A_ > and e G (0,77), depending on the two se- 
quences {(3i,cii) and (/3-,a-). 

(3) Now we prove (34). Since a(F, F') > a(F, F'), it is enough to show that 

a(F, F') = 00. By (37), \f(x) - 1| < 1 when x G (s\ ef) for some e' G (0,e]. 
Thus 

a(F,F')> J \f{x)-l\ 2 g{x)dx>A 3 _ J ^f 1 ' 2 ^' 1 dx. 

The last integral is infinite when $\beta_j \ge \beta_1/2$, and (34) follows by (9).

(4) Finally we prove (35). Recall that F = F + F and F' = F' + F' . The 
measure F" = F' + F is obviously in 7a and satisfies F" = f • F. Since 
$f(x) = 1$ outside $[-\eta,\eta]$, the quantity $a'(F,F'')$ introduced in (9) is

a'(F,F")= [ V x(f(x)-l)g(x)dx 
J-n 

< A lJ \x\~^dx+(^J\J J\x\\f(x)-l\g(x)dx, 

which is finite by (37) (because f3j < /3i/2 < 1) and because / and g are 
bounded on [e,rj] U [—77, — e]. Therefore the number b" = b — Jq M x(f(x) — 
l)g(x)dx is well defined. Now we consider the two triples (b,c,F) and 
(b", c, F"). From what precedes they satisfy the first and the last three prop- 
erties in (9). We also have by (37) 

a(F, F") = f (\f(x) - 1| 2 A |/0r) - l\)g(x) dx 
J-n 

< A\ j £ (Ixl-ft- 1 A {A + \xf'- 2 ^- 1 )) dx 

+ + j £ y\f(x)-l\ 2 /\\f(x)-l\)g(x)dx. 

Since $\beta_j < \beta_1/2$ and since $f$ and $g$ are bounded on $[\varepsilon,\eta] \cup [-\eta,-\varepsilon]$, we deduce
$a(F, F'') < \infty$. So all conditions in (9) are satisfied, and we have proved (35).

APPENDIX B: COMPARING BIG JUMPS AND BIG INCREMENTS 

Before starting, let us mention that for the proofs of Theorems 2 and 3 one
may use a localization argument which allows us to replace Assumption 2 by
the so-called "strengthened Assumption 2," which is the same except that
all processes $b_t$, $c_t$, $a^i_t$ are bounded, as well as the process $A^{j+1}_t$ and $X_t$ itself.

In this section we compare the number of "large" increments of $X$ with
the number of correspondingly large jumps, that is, the numbers

(38) $V(u)_t = \displaystyle\sum_{s \le t} 1_{\{|\Delta X_s| > u\}}$.

We will indeed show that the difference $U(u_n, \Delta_n)_T - V(u_n)_T$ is negligible
for our purposes, when the sequence $u_n$ satisfies (16). The reason for doing
this is that the analysis of the processes $V(u_n)$ is an easy task. Indeed, as
soon as $u_n \to 0$,

(39) $V(u_n)_T - A(u_n)_T = O_P(u_n^{-\beta_1/2})$.

To see this, we observe that each process $M^n = u_n^{\beta_1/2}(V(u_n) - A(u_n))$ is
a quasi-left continuous, purely discontinuous martingale with jumps smaller
than $u_n^{\beta_1/2}$, which goes to 0. Its predictable quadratic variation is $\langle M^n, M^n\rangle =
u_n^{\beta_1} A(u_n)$, which by (6) converges for each $t$ to $A^1_t$. Since further $A^1$ is a
continuous process, it follows from Theorem VI.4.13 of Jacod and Shiryaev
(2003), for example, that the sequence $M^n$ is C-tight (and even converges
in law), so a fortiori, (39) holds.

The main result of this section is the next proposition: 

Proposition 1. Under the strengthened Assumption 2 and if the se- 
quence u n satisfies (16), we have 

(40) U(u n , A n ) T - V(u n ) T = -\o P (u^~^ +1 + u^ 2 ). 

The proof is based on a series of lemmas. The constant $K$ may depend
in an implicit way on the bounds in this strengthened assumption, but not
on the two numbers $u, r \in (0, 1)$ which are fixed in most of this section.

With any càdlàg process $Y$ and $u \in (0, 1]$, we associate the process and
the variables

(41) $Y(u)_t = \displaystyle\sum_{s \le t} \Delta Y_s 1_{\{|\Delta Y_s| > u\}}$, $\quad \zeta(Y, u)^n_i = 1_{\{|\Delta^n_i Y| > u\}}$.

For simpler notation, we denote by $E^n_{i-1}$ and $P^n_{i-1}$, respectively, the
conditional expectation and conditional probability, with respect to $\mathcal{F}_{(i-1)\Delta_n}$.

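Lemma 1. For all $u, r \in (0,1]$ with $u^r \le 1/3$, all $w \in (0,1/3)$ and all
$k \ge 1$, we have

(42) $P^n_{i-1}(\Delta^n_i V(u) \ge k) \le (K\Delta_n u^{-\beta_1})^k$,

(43) $P^n_{i-1}(u(1-w) < |\Delta^n_i X(u^{1+r})| \le u(1+w)) \le K(\Delta_n u^{-\beta_1} w + \Delta_n u^{-\beta_{j+1}} + \Delta_n^2 u^{-\beta_1(2+r)} + \Delta_n^3 u^{-\beta_1(3+3r)})$.

Moreover there is a $\gamma > 0$ such that, if

(44) $\Delta_n \le \gamma u^{\beta_1(1+r)}$,

we have for all $u \in (0, 1]$

(45) $E^n_{i-1}(|\zeta(X(u^{1+r}), u)^n_i - \Delta^n_i V(u)|) \le K(\Delta_n^2 u^{-\beta_1(2+r)} + \Delta_n^3 u^{-\beta_1(3+3r)})$.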

PROOF. If D C R the compensator of the process N(D) t = J2 s <t ^d{AX s ) 

is Jq F s (D) ds. Our strengthened assumption implies the existence of a con- 
stant 9 such that F S (D) < 4>(D), where 

(6u-^, if Dc [-u,u] c , 

<j)(D) = < 9(u-^ 1 w + u-^+ 1 ), if D = [-u(l + w),-u)U(u,u(l + w)], 
{ 0<w<l. 
Then for any finite stopping time $S$ we have

E(N(D) s+t - N(D) S | Ts) < 14(D). 

Let S(D)o = (i — l)A n and S(D)i,S(D)2, ... be the successive jump times 
of N(D) after time (i — l)A n . What precedes implies that for k > 1 and on 
the set {<S(L>) fc _i < iA n }, 

PiSMjKiAn I J' 5(D)fc _ 1 )<E(iV( J D) 4An -iV( J D) 5(D)fc _ i | J^^A^). 

An induction on k yields the following, which gives us the first part of (42): 

(46) Ff^Af N(D) >k) = Fti(S(D) k < iA n ) < {A nl {D)) k . 

In the same way, if D n D' = 0, the set {AfN(D) > k, AfN(D') > 1} is 
the union for I = 1, . . . ,k + 1 of the sets T; = {5(D)j_i < S(Z?')i < 5(Z>)j < 
zA n }, whereas 

P^i^CD),-! < S-pOi < S(D) t < iA n ) 

= K^(l sm _ 1<s{DI)l<lA F(N(D) tAn - N(D) siD , h >k-l + l\ T s(d) 1 )) 

< (A n 0(D)) fe - z+1 P7_ 1 (5'(L')i_ 1 < < ^„) 
= (A n 0(D)) fc -' +1 

x Ef_ 1 (l s(D)i _ 1<iAn P(iV(L>O i A„ - iVpOs^'O^a > 1 1 ^(D)J) 

< (A n ^»(D)) fe - z+1 A ri 0(D')P?_ 1 (-S'(D)i_ 1 <iA n ), 

where (46) has been applied twice. Another application of the same then 
yields 

D n D' = => P"_ 1 (A^(L>) > fc, A^N(D') > 1) 

(47) 

<(fc + l)A*+VP)V(£>')- 

Next, let $w \in (0,1/3]$. By convention $(a, b] = \emptyset$ when $a \ge b$ below. If
$u(1-w) < \Delta^n_i X(u^{1+r}) \le u(1+w)$ we have four (nonexclusive) possibilities:
either $\Delta^n_i N((u^{1+r}, \infty)) \ge 3$, or $\Delta^n_i N((u^{1+r}, u/3]) = \Delta^n_i N((u/3, \infty)) = 1$, or
$\Delta^n_i N((u/3, \infty)) = 2$, or $\Delta^n_i N((u(1-w), u(1+w)]) = 1$. We have an analogous
implication if $-u(1+w) \le \Delta^n_i X(u^{1+r}) < -u(1-w)$. Then (43) easily follows
from (46) applied with $D = [-u^{1+r}, u^{1+r}]^c$, with $D = [-u/3, u/3]^c$ and with
$D = [-u(1+w), -u(1-w)) \cup (u(1-w), u(1+w)]$, and from (47) applied
with $D = [-u/3, -u^{1+r}) \cup (u^{1+r}, u/3]$ and $D' = [-u/3, u/3]^c$.

Finally we prove (45). Let H = \((X(u 1+r ),u)f - A?V(u)\ and D = [-u/2, 
-u l+r ) U (u 1+r ,u/2] and D' = [-u/2,u/2] c and D" = D U D'. From what 
precedes, we have 

^(AfNiD") >k)< (9A n u-^ 1+ ^) k , 
(48) P"_i ( A?N(D') = 2) < 2 A 2 u~ 2/31 , 

P"_ 1 (A[ l iV(Z)) = A?N(D') = 1) < 2 A 2 l n- /3l(2+r) . 
We have $H = 0$ on the sets $\{\Delta^n_i N(D'') \le 1\}$ and $\{\Delta^n_i N(D'') = \Delta^n_i N(D) =
2\}$, and $H \le k - 1$ on the set $\{\Delta^n_i N(D'') = k\}$, for all $k \ge 2$. Thus if $v =
\theta\Delta_n u^{-\beta_1(1+r)}$,

oo 

K-i{H) <J2kVti(&iN(D") > k)+¥2_ 1 (AfN(D') = 2) 

fc=3 

+ ¥^_ X {A^N(D) = AfN(D') = 1) 

oo 

< J2 kv k + e 2 A 2 nU - 2 ^ + e 2 A 2 n u-^ 2+ ^ 

k=3 

by (48). When $v \le 1/2$, that is, when $\Delta_n \le \gamma u^{\beta_1(1+r)}$ for $\gamma = 1/(2\theta)$, we have
$\sum_{k=3}^{\infty} k v^k \le K v^3$, and the above is smaller than the right-hand side of (45).
□
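
The series bound used in the last step can be made explicit: for $0 \le v < 1$ one has $\sum_{k \ge 3} k v^k = v^3(3-2v)/(1-v)^2$, which is at most $8v^3$ when $v \le 1/2$. The following check is only a numerical illustration of that elementary identity; the constant 8 and the function names are choices made for the example, not taken from the proof.

```python
def tail_series(v, terms=200):
    """Truncated sum of k * v**k over k = 3, ..., terms - 1."""
    return sum(k * v ** k for k in range(3, terms))


def closed_form(v):
    """Closed form of the full series sum_{k >= 3} k * v**k for 0 <= v < 1."""
    return v ** 3 * (3 - 2 * v) / (1 - v) ** 2


for v in (0.05, 0.1, 0.25, 0.5):
    # The truncated sum matches the closed form, and both stay below 8 * v**3.
    print(v, tail_series(v), closed_form(v), 8 * v ** 3)
```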

Lemma 2. Let $q \ge 2$ and $u, r \in (0,1)$. As soon as (44) holds for some
constant $\gamma > 0$, we have

(49) $E(|\Delta^n_i(X - X(u^{1+r}))|^q) \le K_{\gamma,q}\,\Delta_n(\Delta_n^{q/2-1} + u^{(q-\beta_1)(1+r)})$.

Proof. Letting $X^c$ and $\mu$ be the continuous martingale part and the
jump measure of $X$, we have $X - X(u^{1+r}) = B + B' + X^c + M$, where

$$B'_t = -\int_0^t ds \int_{\{u^{1+r} < |x| \le 1\}} x\,F_s(dx), \qquad M_t = \int_0^t \int_{\{0 < |x| \le u^{1+r}\}} x\,(\mu - \nu)(ds, dx).$$

By the strengthened Assumption 2, for any y > the integral Jn x \ >y \ \x\F(dx) 

is smaller than K when /?i < 1, than -fTlog | when f3\ = 1, and than Ky 1 ^^ 1 

when /3i > 1. Therefore, since (44) implies 2/3i(l + r) > — 1) + we have 
|A™£?'| < -ftT 7 V ' A n . The strengthened Assumption 2 also implies |A™U| < 
KA n and, by well-known estimates about continuous and purely discontin- 
uous martingales [see, e.g., Ait-Sahalia and Jacod (2011)], we also deduce 
that 

E(|A[ l M'H < K q A n u {q - pi){l+r \ E(|A"X C H < K q A q J 2 . 
All these estimates readily give (49). □ 

Proof of Proposition 1. (a) It follows from (16) that u"n /A n — > 00, 
so for any $\gamma > 0$ (44) is satisfied for all $r \in (0,1)$ and all $n$ large
enough. Hence, both estimates (45) and (49) hold, with constants $K$ and
$K_{\gamma,q}$ independent of $r$, for all $n$ large enough.

The following inequality, where $u, w \in (0,1)$ and $x, y \in \mathbb{R}$, is elementary:

\[
|1_{\{|x+y|>u\}} - 1_{\{|x|>u\}}| \le 1_{\{|y|>uw\}} + 1_{\{u(1-w)<|x|\le u(1+w)\}}.
\]
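
The elementary inequality above lends itself to a quick randomized sanity check. The sketch below is an illustration only: it tests the inequality both with absolute values inside the left-hand indicators and in the one-sided form without them, since both forms follow from the same case analysis. The sampling ranges, seed and function names are arbitrary choices, not part of the paper.

```python
import random


def lhs(x: float, y: float, u: float, absolute: bool = True) -> int:
    """|1{|x+y| > u} - 1{|x| > u}|, or the one-sided version without |.|."""
    if absolute:
        return abs(int(abs(x + y) > u) - int(abs(x) > u))
    return abs(int(x + y > u) - int(x > u))


def rhs(x: float, y: float, u: float, w: float) -> int:
    """1{|y| > u w} + 1{u(1-w) < |x| <= u(1+w)}."""
    return int(abs(y) > u * w) + int(u * (1 - w) < abs(x) <= u * (1 + w))


def count_violations(trials: int = 100_000, seed: int = 0) -> int:
    """Number of sampled (x, y, u, w) for which lhs exceeds rhs."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        u, w = rng.uniform(0.01, 0.99), rng.uniform(0.01, 0.99)
        x, y = rng.uniform(-2.0, 2.0), rng.uniform(-2.0, 2.0)
        for absolute in (True, False):
            if lhs(x, y, u, absolute) > rhs(x, y, u, w):
                bad += 1
    return bad


if __name__ == "__main__":
    print("violations found:", count_violations())
```

The case analysis behind it: if $|y| \le uw$ and $|x|$ lies outside $(u(1-w), u(1+w)]$, then $x$ and $x+y$ fall on the same side of the threshold, so the left-hand side vanishes.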

We apply this with $x = \Delta_i^n X(u_n^{1+r})$ and $x + y = \Delta_i^n X$ and
$u = u_n$, and with $w \le 1/3$ to be chosen later. In order to evaluate the
probabilities for having $|y| > u_n w$, respectively,
$u_n(1-w) < |x| \le u_n(1+w)$, we use (49) and Markov's inequality,
respectively, (43). This gives that
$\mathbb{E}(|\zeta(X(u_n^{1+r}),u_n)_i^n - \zeta(X,u_n)_i^n|)$ is smaller, for all
$q \ge 2$, than



Optimizing over $w$ leads us to take $w = w_n$ such that
$w_n^{q+1} = u_n^{(q-\beta_1)r} + \Delta_n^{q/2-1}/u_n^{q-\beta_1}$, which is indeed
smaller than 1/3 for all $n$ large. Thus, putting the above together with
(45), and recalling that $\Delta_n \le K u_n^{1/\rho}$, we end up with

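\[
\mathbb{E}(|\zeta(X,u_n)_i^n - \Delta_i^n N(u_n)|) \le \frac{K\Delta_n}{u_n^{\beta_1}} \sum_{k=1}^{5} u_n^{x_k},
\tag{50}
\]

where $x_k = x_k(q,r)$ are given by

\[
x_1 = \frac{qr - \beta_1 r}{q+1}, \qquad
x_2 = \frac{q(1-2\rho) - 2 + 2\beta_1\rho}{2\rho(q+1)},
\]
\[
x_3 = \frac{1}{\rho} - \beta_1(1+r), \qquad
x_4 = \frac{2}{\rho} - \beta_1(2+3r), \qquad
x_5 = \beta_1 - \beta_{j+1}.
\]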

(b) Now, for proving (40), it clearly follows from (50) that it suffices
to show that one can choose $q$ and $r$ in such a way that $x_k > \beta_1/2$ for
$k = 1,2,3,4$. When $q \to \infty$ we see that $x_1 \to x'_1 = r$ and
$x_2 \to x'_2 = (1-2\rho)/(2\rho)$,
so it remains to show that one can choose $r \in (0,1)$ such that
$x'_k > \beta_1/2$ for $k = 1,2$ and $x_k > \beta_1/2$ for $k = 3,4$. Letting $r$ be
bigger than but as close as possible to $\beta_1/2$, we deduce from (16) that
such a choice of $r$ is possible, and the proof is complete. □
(1) In addition to the strengthened Assumption 2, we assume (15) for
some e > 0. Theorem 2 says something about the estimators of ft and A l T 
only when ft > Moreover, if (6) holds for the sequence ft,...,ft+i, 
it also holds for the sequence ft, . . . ,/3y +1 , where j' = j if ft + i > 4p and 
j' = sup(i : ft > 4p) otherwise, and where ft = ft when i < j' and /3y +1 = 

ft'+i V Henceforth, upon discarding the indices such that ft < 4r, we 
can assume without loss of generality that 

\[
\beta_1 > \cdots > \beta_j > \beta_{j+1} = \frac{\beta_1}{2}.
\tag{51}
\]

Under this additional assumption, we have ft — ft < 1, and (18) yields 

(52) l< i <k<j => nJr^Mog-L = (nJ- &+1 ). 
Moreover, combining (6), (39) and (40), we deduce that

\[
U(v_n,\Delta_n)_T = \sum_{i=1}^{j} \frac{A_T^i}{v_n^{\beta_i}} + O_P(v_n^{-\beta_1/2})
\tag{53}
\]

for any sequence v n such that v n < u n , and in particular for the sequences 
$v_n = u_{n,i}$. All of the proof will rely on this, and below $H_i$ is always
given by (20).

(2) We first consider the case i = 1 when j > 1. A simple calculation, 
based on (53) applied with v n = u n and v n = ju n , yields that in restriction 
to the set fi^, 

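\[
\log(U(u_n,\Delta_n)_T/U(\gamma u_n,\Delta_n)_T) = (\beta_1 - H_1 u_n^{\beta_1-\beta_2})\log\gamma + o_P(u_n^{\beta_1-\beta_2}).
\]
This gives the first part of (21). It also implies that

\[
\begin{aligned}
u_n^{\tilde\beta_1^n} &= u_n^{\beta_1} e^{-(\tilde\beta_1^n-\beta_1)\log(1/u_n)}\\
&= u_n^{\beta_1}(1 + H_1 u_n^{\beta_1-\beta_2}\log(1/u_n) + o_P(u_n^{\beta_1-\beta_2}\log(1/u_n))).
\end{aligned}
\]

This and (53) yield the second part of (21).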
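
The "simple calculation" in this step can be illustrated on a toy model. The sketch below is not the paper's estimator applied to data: it assumes a noiseless two-term version of (53), with the $O_P$ remainder dropped and a ratio $\gamma > 1$, and checks that the log-ratio statistic equals $\beta_1$ minus a bias of order $v^{\beta_1-\beta_2}$. The constant computed by `toy_bias_constant` is derived for this toy model only and is not quoted from (20); all names and numerical values are hypothetical.

```python
import math


def counting_proxy(v, A, beta):
    """Noiseless stand-in for U(v, Delta_n)_T: sum_i A_i / v**beta_i."""
    return sum(a / v**b for a, b in zip(A, beta))


def log_ratio_index(v, gamma, A, beta):
    """log(U(v) / U(gamma * v)) / log(gamma): first-order index estimate."""
    ratio = counting_proxy(v, A, beta) / counting_proxy(gamma * v, A, beta)
    return math.log(ratio) / math.log(gamma)


def toy_bias_constant(gamma, A, beta):
    """H with estimate = beta_1 - H * v**(beta_1 - beta_2) + smaller terms."""
    d = beta[0] - beta[1]
    return (A[1] / A[0]) * (gamma**d - 1.0) / math.log(gamma)


if __name__ == "__main__":
    A, beta, gamma = (1.0, 2.0), (1.5, 1.0), 2.0  # assumed toy values
    H = toy_bias_constant(gamma, A, beta)
    for v in (1e-2, 1e-4, 1e-6):
        est = log_ratio_index(v, gamma, A, beta)
        print(v, est, (beta[0] - est) / v ** (beta[0] - beta[1]), H)
```

In this toy setting the normalized error $(\beta_1 - \text{estimate})/v^{\beta_1-\beta_2}$ approaches the constant as $v \to 0$, which mirrors the shape of the first part of (21).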

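(3) Now we suppose that (21) holds for all $i \le k-1$, for some
$k \in \{2,\ldots,j-1\}$. We observe that we have the following identities, for
all $y = (y_1,\ldots,y_{k+1})$ and $r = 1,\ldots,k+1$:

\[
\sum_{l=0}^{k-1} (-1)^l \gamma^{-l y_r} \sum_{J \in I(k-1,l)} \gamma^{\sum_{i\in J} y_i}
= \prod_{i=1}^{k-1} (1 - \gamma^{y_i - y_r})
= \begin{cases}
0, & \text{if } r \le k-1,\\
G(k,y,\gamma), & \text{if } r = k,\\
G'(k,y,\gamma), & \text{if } r = k+1,
\end{cases}
\]

where $G(k,y,\gamma) = \prod_{i=1}^{k-1}(1 - \gamma^{y_i - y_k})$ and
$G'(k,y,\gamma) = \prod_{i=1}^{k-1}(1 - \gamma^{y_i - y_{k+1}})$.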
Therefore, (53) applied to u n = x^ l u n ^ and the definition of U n (k,x) yield 
for all x > 1 fixed, and with /3 = (ft, . . . , ft+i), 

^(fc,x) = g-^_g(-iy( 7 -^-7-^) £ ^ EjeJ ^ 

+ £ (7 E -^-7^) 

r=k x u n,fc i=0 Je/(fe-l,0 



+ 



■A-G(M,7) + -^ 
The functions z h- > 7 _k are C°°. The induction hypothesis gives /3™ - /3j =
P (uJ~ ft+1 ) for i = 1, . . . , k - 1. Then (52) and ft - &+1 > e allow us to 
deduce 

o<l<k-i,Jei(k-i,i) 7 Sie/^ _ 7 E i6 jft =0p (^ fc fc i - & + i ) ) 

l<r<fc-l 7 -^- 7 -W= 0p (nf7 /3 '' +1 ) = p(nJ fc - /3fc+1 ). 

Therefore we finally obtain 

xPk <% xPk+l <T 

(54) 

_ A T G(k^^) ( H k log 7 Ph _ p ( h-p k+1 A 



*n,fc 

where the last equality comes from the definition of $H_k$ in (20). Then
exactly as in Step 2, a simple calculation shows the first half of (21) for
$i = k$.

For the second part of (21), and as in Step 2, we first deduce from the 
above that 

«2i = ^M 1 + tf*<V A-1 kg(V«n,*) + °P(<V A_1 Io8(V«n,*)))- 
Therefore it is enough to show that 

<"fc (u(Un, k )T ~ E F >n,f ) = ^ + OP^V"* -1 log(l/nn, fc )). 

In view of (53) with $v_n = u_{n,k}$, this amounts to proving, for $i = 1, \ldots, k-1$,

(55) - ^ugH* = p(nJ fc " & - 1 log(lK, fc )). 

The induction hypothesis yields that 

^ = «5iT*(i + Op(«n7 ft+1 log(lK )fc ))), 

f7 = ^ T + Op(nJ- ft+ Mog(lK, i ))- 

Then (21) readily follows from (18). 

(4) It remains to prove that the variables in (22) are tight. The difference 
with the previous case is that (54) no longer holds when i = j = 1 or k = 
j > 1, but it can be replaced by 

U (j,x) = — ^ (l + Op(u n 3 tj )). 

The rest of the proof goes unchanged [note that $\eta$ in (22) is $\beta_j - \beta_1/2$ here].
APPENDIX D: PROOF OF THEOREM 3

We use simplifying notation: a point in $D$ is $\theta = (x_i,\gamma_i)_{1\le i\le j}$,
and we define the functions $F_{n,l}(\theta) = \sum_{i=1}^{j} \gamma_i/(v_l u_n)^{x_i}$.
The "true value" of the parameter is $\theta_0 = (\beta_i,\Gamma_i)_{1\le i\le j}$,
the preliminary estimators are
$\tilde\theta_n = (\tilde\beta_i^n,\tilde\Gamma_i^n)_{1\le i\le j}$, and the final
estimators are $\hat\theta_n = (\hat\beta_i^n,\hat\Gamma_i^n)_{1\le i\le j}$. We set
$h_n = \log(1/u_n)$, and as in the previous proof we can assume (51).
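
To make the objects in this proof concrete, here is a minimal sketch of the functions $F_{n,l}$ and of the weighted least-squares contrast $\Phi_n(\theta) = \sum_l w_l (U(v_l u_n,\Delta_n)_T - F_{n,l}(\theta))^2$ that the final estimators minimize. The observed counts are replaced by noiseless model values, and the parameter values, weights and function names are hypothetical choices made for illustration.

```python
def model_count(theta, v_l, u_n):
    """F_{n,l}(theta) = sum_i gamma_i / (v_l * u_n)**x_i.

    theta is a list of pairs (x_i, gamma_i).
    """
    return sum(g / (v_l * u_n) ** x for x, g in theta)


def contrast(theta, observed, weights, v, u_n):
    """Phi_n(theta) = sum_l w_l * (U_l - F_{n,l}(theta))**2."""
    return sum(
        w * (obs - model_count(theta, v_l, u_n)) ** 2
        for w, obs, v_l in zip(weights, observed, v)
    )


if __name__ == "__main__":
    theta0 = [(1.5, 1.0), (1.0, 2.0)]   # assumed "true" (beta_i, Gamma_i)
    v = [1.0, 2.0, 3.0, 4.0]            # v_1 = 1 by convention, L = 4
    weights = [1.0] * len(v)
    u_n = 0.01
    # Noiseless stand-in for the counts U(v_l u_n, Delta_n)_T.
    observed = [model_count(theta0, v_l, u_n) for v_l in v]
    print(contrast(theta0, observed, weights, v, u_n))
    print(contrast([(1.4, 1.0), (1.0, 2.0)], observed, weights, v, u_n))
```

With real data the counts carry an error of order $u_n^{-\beta_1/2}$, so the contrast at the true parameter is of order $u_n^{-\beta_1}$ rather than zero; that is the comparison the proof exploits.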

(1) We introduce some specific notation. For $m \ge 2$ we set
$G_m = (1,\infty)^{m-1}$, a point in $G_m$ being denoted as $v = (v_2,\ldots,v_m)$.
For $1 \le k \le j$ and $v \in G_{2k}$ and with the convention $v_1 = 1$, we let
$\Sigma(v)$ be the $2k \times 2k$ matrix with entries

\[
\Sigma(v)_{l,i} = \begin{cases}
v_l^{-\beta_i}, & \text{if } 1 \le i \le k,\\
-v_l^{-\beta_{i-k}} \log v_l, & \text{if } k+1 \le i \le 2k.
\end{cases}
\tag{56}
\]

The aim of this step is to show that the set $Z_k$ of all $v \in G_{2k}$ for
which the matrix $\Sigma(v)$ is invertible satisfies $\lambda_{2k}((Z_k)^c) = 0$,
where $\lambda_r$ is the Lebesgue measure on $G_r$.

When $1 \le m \le 2k$ and $v \in G_{2k}$, we denote by $\mathcal{M}_m(v)$ the family
of all $m \times m$ sub-matrices of the $m \times 2k$ matrix
$(\Sigma(v)_{l,r} : 1 \le l \le m, 1 \le r \le 2k)$. A key fact is that
$\mathcal{M}_m(v) = \mathcal{M}_m(v_m)$ only depends on the restriction
$v_m = (v_2,\ldots,v_m)$ of $v$ to its first $m-1$ coordinates. Moreover,
$\Sigma(v)_{1,i}$ equals 1 if $i \le k$ and 0 otherwise: so the entries of the
first row of any $M \in \mathcal{M}_m(v)$ are 0 or 1, and $\mathcal{M}'_m(v)$
denotes the subset of all $M \in \mathcal{M}_m(v)$ for which $M_{1,i} = 1$ for at
least one value of $i$. Finally, $H_m$ stands for the set of all $v_m \in G_m$
such that all $M \in \mathcal{M}'_m(v_m)$ are invertible. Since
$\mathcal{M}'_{2k}(v)$ is the singleton $\{\Sigma(v)\}$, we have $Z_k = H_{2k}$.

If $m \ge 2$ and $v_m = (v_2,\ldots,v_m) \in G_m$ and $M \in \mathcal{M}'_m(v_m)$,
by expanding along the last row, we see that

\[
\det(M) = \sum_{i=1}^{k} (a_i + a_{k+i}\log v_m)\,v_m^{-\beta_i},
\tag{57}
\]

where each $a_r$ is of the form: either (i) $a_r$ is plus or minus $\det(M_r)$
for some $M_r \in \mathcal{M}_{m-1}(v_m)$ (for $m$ values of $r$) or (ii) $a_r = 0$
(for the other $2k - m$ values of $r$). Note that we can also have $a_r = 0$ in
case (i), and since $M \in \mathcal{M}'_m(v_m)$ there is at least one $a_r$ of
type (i) with $M_r \in \mathcal{M}'_{m-1}(v_m)$.

When at least one $a_r$ in (57) is not 0, the right-hand side of this
expression, as a function of $v_m$, has finitely many roots only, because all
$\beta_i$'s are distinct. Observing that $\mathcal{M}'_1(v)$ is the $1 \times 1$
matrix equal to 1, it follows that, with
$(v_{m-1},v_m) = (v_2,\ldots,v_{m-1},v_m)$ when $v_{m-1} = (v_2,\ldots,v_{m-1})$,
and recalling that with our standing notation $\lambda_2$ is the Lebesgue
measure on $(1,\infty)$,

\[
\begin{aligned}
m = 2 &\;\Rightarrow\; \lambda_2((H_2)^c) = 0,\\
m \ge 3,\ v_{m-1} \in H_{m-1} &\;\Rightarrow\; \lambda_2(v_m : (v_{m-1},v_m) \notin H_m) = 0.
\end{aligned}
\tag{58}
\]
Since

\[
\lambda_m((H_m)^c) = \int_{G_{m-1}} \lambda_2(v_m : (v_{m-1},v_m) \notin H_m)\,\lambda_{m-1}(dv_{m-1}),
\]

which equals
$\int_{H_{m-1}} \lambda_2(v_m : (v_{m-1},v_m) \notin H_m)\,\lambda_{m-1}(dv_{m-1})$ if
$\lambda_{m-1}((H_{m-1})^c) = 0$, when $m \ge 3$, we deduce from (58), by induction
on $m$, that indeed $\lambda_m((H_m)^c) = 0$ for all $m = 2,\ldots,2k$. Recalling
$Z_k = H_{2k}$, the result follows.

Since the claim of the theorem holds for all $(v_2,\ldots,v_L)$ outside a
$\lambda_L$-null set only, and $L \ge 2k$, we thus can and will assume below that
the numbers $v_l$ are such that $v_{2k} = (v_2,\ldots,v_{2k}) \in Z_k$, hence
$\Sigma(v_{2k})$ is invertible, for all $k = 1,\ldots,j$.
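
The matrix of (56) is easy to build numerically, which gives a feel for the invertibility claim of this step. The sketch below is illustrative only: it assumes rows indexed by $l$ (the grid points $v_l$, with $v_1 = 1$) and columns indexed by $i$, takes the sign of the logarithmic block as a convention that does not affect invertibility, and uses a plain Gaussian-elimination determinant so that no external library is needed. The numerical values and helper names are hypothetical.

```python
import math


def sigma_matrix(v, beta):
    """2k x 2k matrix: row l uses v[l-1] (v[0] must be 1), columns i = 1..2k.

    Columns 1..k hold v_l**(-beta_i); columns k+1..2k hold
    -v_l**(-beta_{i-k}) * log(v_l) (sign convention is an assumption).
    """
    k = len(beta)
    assert len(v) == 2 * k and v[0] == 1.0
    return [
        [vl ** (-b) for b in beta] + [-(vl ** (-b)) * math.log(vl) for b in beta]
        for vl in v
    ]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, det = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= f * a[c][j]
    return det


if __name__ == "__main__":
    # k = 1: the determinant is -v_2**(-beta_1) * log(v_2), nonzero for v_2 > 1.
    print(determinant(sigma_matrix([1.0, 2.0], [1.5])))
    # k = 2 with an arbitrary grid; the proof says singular grids are Lebesgue-null.
    print(determinant(sigma_matrix([1.0, 2.0, 3.0, 4.0], [1.5, 1.0])))
```

For $k = 1$ the determinant is available in closed form, so the numerical routine can be compared against it; for larger $k$ the printout only indicates whether a particular grid happens to be non-singular.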

(2) Our assumptions on the preliminary estimators yield that the set £l n 
on which \\9f — 9o\\ < 1/un satisfies P(r2 n ) — >• 1. So below we argue on the 
set Q n , or equivalently (and more conveniently) we suppose Q n = Q. Then 6 n 
converges pointwise to 9q, which belongs to all the sets D n . Set 

y? = MriM - ft), z? = - A T + y?h n , a? = \y?\h n + \z?\. 

We have af < 2un T1 h n because £l n = f2. Then an expansion of [xi^ij i— >■ 
r yi/(viu n ) Xi around (ft,^) yields for all /, 

(59) J^-^-^''-^'^ 

where 

< K\y?\h n (\z?\ + \yf\) < K\y?\h n a? < K(a?f. 
Combining (6), (39) and (40), we see that

\[
U(v_l u_n,\Delta_n)_T - F_{n,l}(\theta_0) = O_P(u_n^{-\beta_1/2}).
\]
Since $\Phi_n(\theta) = \sum_{l=1}^{L} w_l (U(v_l u_n,\Delta_n)_T - F_{n,l}(\theta))^2$, we deduce

\[
\Phi_n(\theta_0) = O_P(u_n^{-\beta_1}).
\]

Since $\theta_0 \in D_n$ and $\hat\theta_n$ minimizes $\Phi_n$ over $D_n$, we also
have $\Phi_n(\hat\theta_n) = O_P(u_n^{-\beta_1})$,
hence $F_{n,l}(\theta_0) - F_{n,l}(\hat\theta_n) = O_P(u_n^{-\beta_1/2})$ for all
$l$. Using (59), this can be rewritten as

(60) £ r^fW - y ^ vi + <j) = °p^n Pl/2 )- 

^ («pMn) ft 

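(3) Taking $k$ between 1 and $j$, we consider the $2k$-dimensional vectors
$\zeta(k,n)$ and $\xi(k,n)$ with components (for $l = 1,\ldots,2k$),

\[
\zeta(k,n)_l = \sum_{i=1}^{k} \frac{z_i^n - y_i^n\log v_l}{(v_l u_n)^{\beta_i}},
\qquad
\xi(k,n)_i = \begin{cases}
z_i^n u_n^{-\beta_i}, & \text{if } 1 \le i \le k,\\
y_{i-k}^n u_n^{-\beta_{i-k}}, & \text{if } k+1 \le i \le 2k.
\end{cases}
\]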

With matrix notation, and (56), we have $\zeta(k,n) = \Sigma(v_{2k})\,\xi(k,n)$, hence

\[
\xi(k,n) = \Sigma(v_{2k})^{-1}\,\zeta(k,n).
\tag{61}
\]
Next, we have

and hence (60) and af < -ftTu^/i n < K//^ < if yield 

|C(M)ll<* E«) 2 < ft + + E a>n ft +Op(^ +1 ). 
\i=i n i=k+i J 

By (61) the variables £(k, n)i satisfy the same estimate. Since < (|£(fc, n)fc| + 
|^(/c,n) 2 fc|/i„)u^ fe , 

aZ<Ch n ( k J2K) 2 ^ + §+ E a>^)+Op(^^' +1 ) 
\i=i a ™ i=k+i J 

for some constant C. When n is large enough, C/h n < ^, and we deduce 

(62) a n k < 2Ch n (X>?)M fc_ft + E a >n fc " A ) + OpiKu^^). 

\i=l i=k+l / 

(4) In view of the definition of $y_i^n$ and $z_i^n$, to get the result, and
recalling that we assume $\beta_{j+1} = \beta_1/2$, it is clearly enough to prove
the existence of a number $\nu > 0$ such that, for all $i = 1,\ldots,j$, we have

\[
a_i^n = O_P(h_n^{\nu} u_n^{\beta_i-\beta_{j+1}}).
\tag{63}
\]

To this aim, we introduce the following property, denoted {Pm,q,r), where r 
runs through {1, . . . ,j} and m,q > 1, and where we use the notation £ r = 

Pr — Pr+1- 

(64) i = l,...,r => ar = Op(CK l " /3r+,?Cr +^ l " /3r+1 ))- 

Since $a_i^n \le K$, applying (62) with $k = 1$ yields
$a_1^n = O_P(h_n u_n^{\beta_1-\beta_2})$, which is $(P_{1,1,1})$.

Next, we suppose that $(P_{m,q,r})$ holds for some $r < j$, and for some
$m, q \ge 1$. Letting first $k = r+1$, we deduce from (62) that, since again
$a_i^n \le K$,

/ k-l 

,/3 fe -ft+2(/3 i -/3 r +gCr) ,..Pi-Pr+l\ 



i=l 
(65) u^ + Ku^-^ 1 ]

i=k+l / 
= P (h l n +2m (u^-^ +2 ^ + u£ + U^~^+ 2 )), 

where the last line holds because k = r + 1 and h n > 1 for n large enough 
and the sequence is decreasing. This in turn implies, for k = r + 1 again, 

(66) al = Op{h r n +2 - k+2m (u^~^ +2 ^ + u^~^)). 

Then, exactly as above, we apply (62) with $k = r$, and (64) and also (66)
with $k = r+1$, to get that (66) holds for $k = r$ as well. Repeating the
argument, a downward induction yields that indeed (66) holds for all $k$
between 1 and $r+1$. Thus (64) holds with $q$ and $m$ substituted with $2q$ and
$r+1+2m$. Hence $(P_{m,q,r})$ implies $(P_{r+1+2m,2q,r})$. Since obviously
$(P_{m,q,r}) \Rightarrow (P_{m,q',r})$ for any $q' \in [1,q]$, by a repeated use
of the previous argument we deduce that if $(P_{m,1,r})$ holds for some
$m \ge 1$, then for any $q' \ge 1$ we can find $m(q') \ge 1$ such that
$(P_{m(q'),q',r})$ holds as well.

Now, assuming (P m ,q,r) f° r some m,q,r, we take g' = V 1 and m' = 
m{q'). What precedes yields (Pm',q',r), hence (65) holds for all k < r + 1, 
with q' and m' . In view of our choice of q' , this implies that (i-V+i+^i.r+i) 
holds. Since $(P_{1,1,1})$ holds, we see by induction that for any $r \le j$
there exists $m_r \ge 1$ such that $(P_{m_r,1,r})$ holds.

It remains to apply (64) with $r = j$ and $m = m_r$ and $q = 1$, and we get (63)
with $\nu = m_j$. This completes the proof.

APPENDIX E: PROOF OF THEOREM 4 

The proof of Theorem 4 is contained in the supplemental article
[Aït-Sahalia and Jacod (2012)].

SUPPLEMENTARY MATERIAL 

Supplement to "Identifying the successive Blumenthal-Getoor indices of
a discretely observed process" (DOI: 10.1214/12-AOS976SUPP; .pdf). This
supplement contains the proof of Theorem 4.